Investigating Multilingual, Multi-Script Support in Lucene/Solr Library Applications
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Bibliographic Access to Non-Roman Scripts in Library Opacs: a Study of Selected Arl Academic Libraries in the United States
BIBLIOGRAPHIC ACCESS TO NON-ROMAN SCRIPTS IN LIBRARY OPACS: A STUDY OF SELECTED ARL ACADEMIC LIBRARIES IN THE UNITED STATES By Ali Kamal Shaker BA—Cairo University-Egypt, 1995 MLIS—University of Wisconsin-Milwaukee, 2000 Submitted to the Graduate Faculty of School of Information Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Pittsburgh 2002 ii BIBLIOGRAPHIC ACCESS TO NON-ROMAN SCRIPTS IN LIBRARY OPACS A STUDY OF SELECTED ARL ACADEMIC LIBRARIES IN THE UNITED STATES Ali K. Shaker, PhD University of Pittsburgh, 2002 With the increasing availability of non-Roman script materials in academic/research libraries, bibliographic access to vernacular characters in library OPACs becomes one of the primary means for users of these materials to access and use them efficiently. Though Romanization, as a bibliographic control tool, has been studied extensively during the past five decades, investigations of using original (vernacular) scripts remain inadequate. The purpose of this study was to trace the transition from Romanization-based to vernacular-based bibliographic access to non-Roman script materials. Two major developments contributing to this transition were the availability of records with script characters from bibliographic utilities, and the development of a universal character set, the Unicode standard. The main data collection instrument was a self-administered mail questionnaire sent to a purposive sample of academic library members in the Association of Research Libraries with sizeable non-Roman script collections. Another data collection technique utilized was document/Web site analysis of bibliographic utilities and library automation vendors. Forty five questionnaires were obtained, which represented 65% of an actual population of 69 libraries. -
Association for Library Collections and Technical Services Task Force on Non-English Access
Association for Library Collections and Technical Services Task Force on Non-English Access Complete Report September 18, 2006 Chair: Beth Picknally Camden Members: Joan Aliprand Diana M. Brooking Karen Coyle Ann Della Porta Shi Deng William J. Kopycki Heidi G. Lerner Kristin Lindlan Sally H. McCallum David N. Nelson Glenn Patton Karen Smith-Yoshimura Table of Contents I. Executive summary ......................................................................................................... 3 II. Recommendations ......................................................................................................... 5 III. Terminology.................................................................................................................. 8 IV. Overview of the current situation ............................................................................... 10 Appendix A: Charge ......................................................................................................... 17 Appendices B-L: Reports .................................................................................................. 19 Appendices B-E: ALCTS Sections, Committees and Task Forces .................................. 19 Appendix B: Machine-Readable Bibliographic Information (MARBI) Committee .... 19 MARC Proposals, Reports, and Discussion Papers Related to Character Sets and Use of Unicode in MARC 21.................................................................................... 22 Appendix C: Committee on Cataloging: Description and Access -
Planning for Unicode in Libraries: the LC Perspective
Library of Congress Planning for Unicode in Libraries: the LC Perspective SLA Annual Conference 2005 Ann Della Porta Integrated Systems Office Library of Congress Good morning. Thank you for inviting me to speak with you today. This is my first time at SLA and I’m really pleased to be here. I’d also like to thank Foster Zhang for his advice in developing a presentation for this session. What I’d like to do today is describe what LC has been doing to plan for conversion to Unicode and share with you some the issues we’ve identified in testing and implementing Unicode-compliant systems. I actually have some answers and suggestions, but I also have some questions that I think need to be considered by librarians. 7/19/2005 1 Library of Congress Agenda Topics z Background on Unicode z Implementing Unicode in the LC ILS z Planning for MARC 21 in the Unicode environment z Cataloging policy & the bibliographic utilities z OPAC users & Unicode This is my agenda for today: First, a bit of background on Unicode, with a focus on libraries and their users; then, LC’s planning for Unicode conversion in our integrated library system and an update on the MARC standard and Unicode; I’ll discuss cataloging policy issues and planning with the bibliographic utilities for the Unicode environment; and I have a few words about OPAC users and plans to support them. 7/19/2005 2 Library of Congress Background on Unicode z Unicode Mini-tutorial – Unicode “Lite” » What is Unicode? » Pre-Unicode Environment » Unicode Environment » Basic Design Features and Syntax » MARC-8 and UTF-8 » Misconceptions Before I talk about what we’re planning at LC, I’d like to talk about the Unicode standard as it relates to libraries so that we have a common language. -
Task Force on Non-English Access: Report
Association for Library Collections and Technical Services Task Force on Non-English Access Report September 18, 2006 Revised March 16, 2007 Chair Beth Picknally Camden Members Joan Aliprand Diana M. Brooking Karen Coyle Ann Della Porta Shi Deng William J. Kopycki Heidi G. Lerner Kristin Lindlan Sally H. McCallum David N. Nelson Glenn Patton Karen Smith-Yoshimura Table of Contents I. Executive Summary 4 II. Recommendations 4 Additional Recommendations 7 III. Terminology 7 Definition of Terms 7 Examples of Terms 7 IV. Overview of the Current Situation 8 A. Cataloging Practices 8 B. MARC 21 and Unicode 8 C. Non-English Support In Library Systems 8 D. Non-Roman Scripts in the Bibliographic Utilities 9 E. Non-Roman Scripts in Authority Files 10 F. Staffing and Economic Issues 10 G. Non-Roman collections 10 H. Unicode and Libraries 11 I. Conclusion 11 Appendix A: Charge 13 Association for Library Collections and Technical Services ALCTS Executive Committee, Task Force on Non-English Access 13 Appendix B: Machine-Readable Bibliographic Information (MARBI) Committee 14 MARC Proposals, Reports, and Discussion Papers Related to Character Sets and Use of Unicode in MARC 21 15 Appendix C: Committee on Cataloging: Description and Access (CC:DA) 17 CC:DA and JSC Actions Relating to Non-English Resources and Non-Roman Scripts 17 Appendix D: ALCTS Committee on Cataloging: Asian and African Material (CC:AAM) 23 Reports of Activities 23 Appendix E: Other ALCTS groups (in alphabetical order) 26 ALCTS Acquisitions Section (AS) 26 ALCTS Cataloging & Classification