The Impact of Adopting Continuous Integration on the Delivery Time of Merged Pull Requests: an Empirical Study

Federal University of Rio Grande do Norte
Center of Exact and Earth Sciences
Department of Informatics and Applied Mathematics
Graduate Program in Systems and Computing
Academic Master’s Degree in Systems and Computing

João Helis Junior de Azevedo Bernardo

Natal, Brazil
July, 2017

A dissertation submitted to the Computer Science Graduate Program of the Center of Exact and Earth Sciences in conformity with the requirements for the Degree of Master in Systems and Computing.

PPgSC - Graduate Program in Systems and Computing
DIMAp - Department of Informatics and Applied Mathematics
UFRN - Federal University of Rio Grande do Norte

Advisor: Uirá Kulesza
Co-Advisor: Daniel Alencar da Costa

Cataloging-in-Publication. UFRN / SISBI / Specialized Sectoral Library of the Center of Exact and Earth Sciences – CCET.

Bernardo, João Helis Junior de Azevedo. The impact of adopting continuous integration on the delivery time of merged pull requests: an empirical study / João Helis Junior de Azevedo Bernardo. – Natal, RN, 2017. 96 leaves: ill.
Advisor: Prof. Dr. Uirá Kulesza. Co-advisor: Prof. Dr. Daniel Alencar da Costa.
Dissertation (master’s) – Universidade Federal do Rio Grande do Norte. Center of Exact and Earth Sciences. Department of Informatics and Applied Mathematics. Graduate Program in Systems and Computing.
1. Software engineering – Dissertation. 2. Continuous integration – Dissertation. 3. Pull-based development – Dissertation. 4. Pull request – Dissertation. 5. Delivery time – Dissertation. 6. Delivery delay – Dissertation. 7. Mining software repositories – Dissertation. I. Kulesza, Uirá. II. Costa, Daniel Alencar da. III. Title.
RN/UF/BSE-CCET CDU 004.41

Acknowledgements

First and foremost, I would like to thank God, the Almighty, for giving me strength and support throughout this quest for knowledge, and especially for showing me the way forward in the most difficult moments of my life. Without His blessings, I certainly would not have got here.

My deep gratitude goes to my parents, João Helis Bernardo and Rosilda de Azevedo Bernardo, and to my sister Juliana Raffaely de Azevedo Bernardo. Without their love, dedication and support in every single part of my life, I would not be who I am. Thanks for teaching me that I can never give up on my dreams.

I would like to express my deepest gratitude and special thanks to my girlfriend Milenna Veríssimo, for her love, support and constant patience. Thanks for always encouraging me to be a better man. I love you.

I would like to express my most sincere gratitude to my advisor Uirá Kulesza, who gave me the opportunity to work with him and expertly guided me on the path that I walked during my master’s degree. I would also like to thank my co-advisor and friend Daniel Alencar da Costa, for mentoring me and providing me with all the support that I needed to conduct the studies that we performed in this dissertation. Without his precious guidance, I would not have been able to bring this work to its current state.

I would like to extend my appreciation to my laboratory colleagues, Leo Moreira, Fabio Penha, and Eduardo Nascimento, who helped to lighten the pressures that we were facing together in the final stages of our master’s degree by providing moments of shared knowledge and fun through the so-called "coffee time".

Finally, I am very grateful to CNPq for the financial support.

Society must learn that we Indians can and should use technology and information in our everyday activities. That doesn’t make us any less Indians. Being Indian is in the blood that flows through our veins, not in clothing and utensils that we use or any external characteristic.

Abstract

Continuous Integration (CI) is a software development practice that leads developers to integrate their work more frequently. Software projects have broadly adopted CI to ship new releases more frequently and to improve code integration. The adoption of CI is usually motivated by the allure of delivering new software content more quickly and frequently. However, there is little empirical evidence to support such claims. In recent years, many available software projects from social coding environments such as GitHub have adopted the CI practice using CI facilities that are integrated in these environments (e.g., Travis-CI). In this dissertation, we empirically investigate the impact of adopting CI on the time-to-delivery of pull requests (PRs), through the analysis of 167,037 PRs of 90 GitHub projects that are implemented in 5 different programming languages. On analyzing the percentage of merged PRs per project that missed at least one release prior to being delivered to the end users, the results show that before adopting CI, a median of 13.8% of merged PRs are postponed by at least one release, while after adopting CI, a median of 24% of merged PRs have their delivery postponed to future releases. Contrary to what one might speculate, we find that PRs tend to wait longer to be delivered after the adoption of CI in the majority (53%) of the studied projects. The large increase in PR submissions after CI is a key reason why these projects deliver PRs more slowly after adopting CI: 77.8% of the projects increase the rate of PR submissions after adopting CI. To investigate the factors that are related to the time-to-delivery of merged PRs, we train linear and logistic regression models, which obtain sound median R-squared values of 0.72-0.74 and good median AUC values of 0.85-0.90.
A deeper analysis of our models suggests that, before and after the adoption of CI, the intensity of code contributions to a release may increase the delivery time due to a higher integration load (in terms of integrated commits) on the development team. Finally, we are able to accurately identify merged pull requests that have a prolonged delivery time. Our regression models obtained median AUC values of 0.92 to 0.97.

Keywords: Continuous Integration; Pull-based Development; Pull Request; Delivery Time; Delivery Delay; Mining Software Repositories.

Resumo

Continuous Integration (CI) is a software development practice that leads developers to integrate their source code more frequently. Software projects have widely adopted CI in order to improve code integration and to ship new releases to their users more quickly. The adoption of CI is usually motivated by the appeal of delivering new software functionality more quickly and frequently. However, there is little empirical evidence to justify such claims. In recent years, many software projects available in social coding environments such as GitHub have adopted the CI practice using services that can be easily integrated into these environments (e.g., Travis-CI). This dissertation empirically investigates the impact of adopting CI on the delivery time of pull requests (PRs), through the analysis of 167,037 PRs from 90 GitHub projects that are implemented in 5 different programming languages. On analyzing the percentage of merged PRs per project that missed at least one release before being delivered to the end users, the results showed that before the adoption of CI, a median of 13.8% of merged PRs have their delivery postponed by at least one release, whereas after the adoption of CI, a median of 24% of merged PRs have their delivery postponed to future releases. Contrary to what one might speculate, we observed that PRs tend to wait longer to be delivered after the adoption of CI in the majority (53%) of the investigated projects. The large increase in PR submissions after CI is a key reason why projects take longer to deliver PRs after adopting CI: 77.8% of the projects increase the rate of PR submissions after adopting CI. To investigate the factors related to the delivery time of merged PRs, we trained linear and logistic regression models, which obtained median R-squared values of 0.72-0.74 and good median AUC values of 0.85-0.90. Deeper analyses of our models suggest that, before and after the adoption of CI, the intensity of code contributions to a release may increase the delivery time of PRs due to a higher integration load (in terms of integrated commits) on the development team. Finally, we present heuristics capable of accurately identifying the PRs that have a prolonged delivery time. Our regression models obtained median AUC values of 0.92 to 0.97.

Keywords: Continuous Integration; Pull-based Development; Pull Request; Delivery Time; Delivery Delay; Mining Software Repositories.

List of Figures

Figure 1 – An overview of the scope of the dissertation.
Figure 2 – An overview of the pull-based development model that is integrated with Continuous Integration.
Figure 3 – An illustrative example of how we compute delivery time in terms of days.
Figure 4 – An illustrative example of how we compute delivery time in terms of releases.
Figure 5 – The basic life-cycle of a released pull request.
Figure 6 – Training Linear and Logistic Regression Models.
Figure 7 – Percentage of merged pull requests that have a long delivery time.
Figure 8 – An overview of our project selection process.
Figure 9 – Number of projects grouped by programming language.