QUALITY ASSURANCE in TERMINOLOGY MANAGEMENT Recommendations from the Termfactory Project
Total Page:16
File Type:pdf, Size:1020Kb
Igor Kudashev QUALITY ASSURANCE IN TERMINOLOGY MANAGEMENT Recommendations from the TermFactory project Helsinki 2013 Igor Kudashev holds a PhD degree and the title of associate professor from the University of Helsinki, Finland. He is an author or co-author of several published terminological dictionaries, including the Finnish-Russian Forestry Dictionary, which was awarded the International Terminology Award for Applied Research and Development by the European Association for Terminology (EAFT) in 2008. The author has also developed MyTerMS, an in-house terminology management system. Experience obtained in these projects proved very valuable when addressing such a challenging topic as quality assurance in terminology management and collaborative content creation. This guide provides recommendations on quality assurance in terminology manage- ment as well as practical quality assurance tools from the TermFactory research and development project. Launched at the University of Helsinki in 2008, the TermFactory project is developing an ontology-underpinned platform for collaborative terminology work. The guide is addressed to a wide range of experts engaged in terminology work and terminology management. ISBN 978-952-10-7756-2 (paperback) ISBN 978-952-10-7757-9 (PDF) Unigrafia 2013 9 789521 077562 Igor Kudashev QUALITY ASSURANCE IN TERMINOLOGY MANAGEMENT Recommendations from the TermFactory project Helsinki 2013 Igor Kudashev QUALITY ASSURANCE IN TERMINOLOGY MANAGEMENT: RECOMMENDATIONS FROM THE TERMFACTORY PROJECT Responsible Organization: University of Helsinki, Palmenia Centre for Continuing Education, Kouvola Copyright © 2013 Igor Kudashev Commercial use of this publication is prohibited without written permission from the author. Cover design: Igor Kudashev and Hanna Sario Back cover photograph by Irina Kudasheva ISBN 978-952-10-7756-2 (paperback) ISBN 978-952-10-7757-9 (PDF) UNIGRAFIA HELSINKI 2013 TABLE OF CONTENTS PREFACE ...................................................................................................................7 ACKNOWLEDGEMENTS...........................................................................................9 1 OVERVIEW OF THE TERMFACTORY PLATFORM .....................................10 1.1 ONTOLOGY-BASED ARCHITECTURE .....................................................................10 1.2 SUPPORT FOR COLLABORATIVE CONTENT CREATION.............................................12 1.3 A DISTRIBUTED AND CLOUD-READY SYSTEM.........................................................13 2 QUALITY ASSURANCE INFRASTRUCTURE ..............................................14 2.1 DEFINITION OF BASIC CONCEPTS.........................................................................14 2.2 CLASSIFICATION OF ELEMENTS OF QUALITY ASSURANCE INFRASTRUCTURE ............15 2.2.1 Methodological data........................................................................................16 2.2.2 Structural metadata.........................................................................................16 2.2.3 Descriptive metadata ......................................................................................17 2.3 ELEMENTS OF QUALITY ASSURANCE INFRASTRUCTURE INCLUDED IN THIS GUIDE .....19 3 METHODOLOGICAL ASPECTS OF QUALITY ASSURANCE IN COLLABORATIVE TERMINOLOGY WORK ............................................22 3.1 WORKFLOWS AND USER ROLES IN COLLABORATIVE TERMINOLOGY WORK ...............22 3.1.1 Workflows in traditional terminology work.......................................................22 3.3.2 Comparison of traditional and collaborative content creation..........................27 3.3.3 Benefits and challenges of collaborative terminology work.............................32 3.3.4 Recommendations on the organization of collaborative terminology work on the TermFactory platform...........................................................................39 3.2 OBJECTS OF TERMINOLOGICAL DESCRIPTION IN A TERM BANK................................48 3.2.1 General requirements for objects of description in a term bank......................48 3.2.2 LSP designations that can be objects of description in a term bank...............49 3.2.3 LSP units that are unlikely to be objects of description in a term bank ...........61 3.2.4 ‘Impurities’ in term banks ................................................................................62 3.2.5 Forms of LSP designations.............................................................................64 3.2.6 Terminological lexeme....................................................................................70 3 4 STRUCTURAL ASPECTS OF QUALITY ASSURANCE IN TERMINOLOGY MANAGEMENT............................................................. 72 4.1 LANGUAGE IDENTIFICATION AND INDICATION ........................................................ 72 4.2 CHARACTER ENCODING..................................................................................... 75 4.3 COLLATION....................................................................................................... 77 4.3.1 Standard-based customization of collation rules ............................................ 78 4.3.2 Extensions to the collation customization table .............................................. 79 4.4 DOMAIN CLASSIFICATION ................................................................................... 84 4.4.1 Reasons for using domain classification in a term bank................................. 84 4.4.2 Overview of existing domain classifications (case: Finland)........................... 85 4.4.3 Overview of domain classifications used in term banks ................................. 86 4.4.4 Problems with existing domain classifications................................................ 87 4.4.5 Requirements for domain classification.......................................................... 88 4.4.6 General problems in compilation of domain classifications ............................ 89 4.4.7 Principles of compilation of the TermFactory core domain classification........ 90 4.4.8 User extensions to the core domain classification.......................................... 97 4.5 DATA CATEGORY CLASSIFICATION ...................................................................... 99 4.5.1 The concept and function of data categories.................................................. 99 4.5.2 Typical mismatches between data categories.............................................. 100 4.5.3 Mapping between formally different data categories .................................... 102 4.5.4 Mapping between structurally different data categories ............................... 103 4.5.5 ISO 12620 data category classification ........................................................ 104 4.5.6 Linguistic classification of data categories.................................................... 106 4.5.7 Mapping of other data category classifications onto linguistic classification 114 5 DESCRIPTIVE METADATA AND ITS ROLE IN QUALITY ASSURANCE 116 5.1 DOCUMENTATION OF SOURCES ........................................................................ 116 5.1.1 Types of data related to the documentation of sources................................ 117 5.1.2 Basic requirements for source references.................................................... 118 5.1.3 Additional elements of source references .................................................... 119 5.1.4 Basic requirements for bibliographic records ............................................... 121 5.1.5 Documentation of private sources and contributors’ expertise ..................... 123 5.1.6 Advanced support for source management.................................................. 124 5.2 ADMINISTRATIVE DATA..................................................................................... 126 5.2.1 Definition of administrative data ................................................................... 126 5.2.2 The potential multifunctionality of administrative data categories................. 126 5.2.3 Administrative data categories in ISO 12620 ............................................... 127 5.2.4 Classification of administrative data ............................................................. 128 4 6 CONCLUSION..............................................................................................135 BIBLIOGRAPHY.....................................................................................................139 APPENDIX 1 CLASSIFICATION OF TERMINOLOGICAL DATA.......................147 APPENDIX 1.1 LINGUISTIC CLASSIFICATION OF TERMINOLOGICAL DATA........................147 APPENDIX 1.2 MAPPING OF ISO 12620:1999 DATA CATEGORIES ONTO LINGUISTIC CLASSIFICATION OF TERMINOLOGICAL DATA.........................................149 APPENDIX 2 CLASSIFICATION OF ADMINISTRATIVE DATA .........................159 APPENDIX 3 DOMAIN CLASSIFICATION..........................................................160 APPENDIX 3.1 DOMAIN CLASSIFICATION IN FINNISH ...................................................160 APPENDIX 3.2 DOMAIN CLASSIFICATION IN ENGLISH..................................................177 APPENDIX 3.3 DOMAIN CLASSIFICATION IN RUSSIAN..................................................194 APPENDIX 3.4 DOMAIN CLASSIFICATION IN GERMAN..................................................211 APPENDIX 3.5 PRIMARY REFERENCE SOURCES OF TERMFACTORY DOMAIN CLASSIFICATION....................................................................228 APPENDIX