50 The Network Digital Resources

East Asia Resource Library in Slovenia: Open a matchbox and access global knowledge of the region

Zlatko Šabič and Mirjam Kotar

In September 2017 a delegation from a Chinese institution came to visit the location of the East Asia Resource Library (EARL), which is hosted by the University of Ljubljana’s Faculty of Social Sciences, in Slovenia. This is a modern, spacious building, home to about 3,000 students, researchers and faculty and a venue for countless events at all levels. The delegation could hardly hide their surprise when they realised that the EARL was made up of three rooms which barely exceed 200 square meters. They thought that the entire Faculty of Social Sciences was the ‘Library’.

We, too, were taken a bit by surprise by such an observation, but we also took a lesson from that impression. Since then, we never fail to explain to our guests that the word ‘space’ has different meanings in different contexts. In Slovenia, which given its geographical determinants belongs to a group of smaller countries on this planet, physical size does not matter much. There are many determinants that relativise it. What follows is a story that proves this.

In November 2015, a group of four scholars (sinologists, Japanologists and experts in international relations from the University of Ljubljana) began to meet regularly to reflect on the academic landscape in Slovenia and the neighbouring countries with regard to studying the East Asian region. They agreed that there were several study programmes devoted to East Asia in the region but that, comparatively speaking, these have modest support in literature and in primary sources. To access these (re)sources, students and experts need to travel to East Asia. So the group started wondering: if all these resources are difficult to reach, why not bring them closer to us?

With technological advances, the idea of making ‘smallness’ irrelevant seemed doable. After all, digital databases already cover most secondary sources, and the digitalisation of primary sources is progressing rapidly. Hence, with the support of experts on digital databases, a concept of a regional hub for resources devoted to East Asia started to take shape.

By the end of 2015, work on the EARL began. Its mission has been embraced and supported by the University of Ljubljana’s Faculty of Arts and Faculty of Social Sciences. Both faculties are large teaching and research institutions (https://www.fdv.uni-lj.si/en/home; and http://www.ff.uni-lj.si/an). On 17 May 2016, the Deans of the two Faculties signed an agreement to establish the East Asia Resource Library – EARL (https://www.fdv.uni-lj.si/en/library/earl).

The EARL was structured as follows. Symbolically, the EARL signified a region. It provided designated spaces for each of the participating institutions, which are also called ‘corners’. Officially, the ‘corners’ are named as follows (in alphabetical order): Corner Reading Beijing, Japan Corner, Korea Corner, and Taiwan Resource Centre for Chinese Studies. Each section is organised differently. Embassies from Korea and Japan in Slovenia serve as facilitators of information and contacts with the Japan Foundation and the Korea Foundation that provide the financial support for electronic databases, books and other resources. The EARL also collaborates extensively with the Capital Library of China, with the Taiwan Resource Centre for Chinese Studies and various nonprofit institutions from East Asia. All these institutions contribute to the unique concentration of knowledge about the region.

EARL offers access to several East Asian databases and to over 3,000 titles of printed books and other materials. It is also a lively social place, where students, scholars, practitioners and interested public from the region meet.

The EARL is yet another embodiment of the conviction that Humanities and Social Sciences must go hand in hand to understand what is going on around us and how we got there. Interdisciplinarity matters. One might have the most detailed knowledge about politics or the social fabric of any corner in this world; yet, if one does not understand the culture and speak the language people speak in one part of the (East Asian) region or the other, one shall never have a full understanding of it. In modern social sciences and humanities, it is safe to say that this kind of thinking has not yet been internalised. The EARL is built on the premise that the ‘wall’ between social sciences and humanities is not really high if we use the right ladder, the one which is built from the wealth of interdisciplinary knowledge and experience of scholars coming out from the two branches of science.

In conclusion, let us go back to the reflection on size. We see the EARL not in terms of the small, landlocked room, but in terms of what it really offers. It is a large port, with access to the open sea of resources, going into millions of books, articles and historical documents. In international relations, Open Sea signifies a space with no borders. In our world of science and teaching, knowledge is our ocean. This is why EARL’s motto is: knowledge knows no borders. It is in this vast space that people with knowledge meet, compare notes, and discuss. With this in mind the EARL hosts presentations, lectures, a gallery, cultural events, and also academic, professional and unofficial meetings. A very special value of the EARL is the unreserved commitment of our partners to provide the EARL with the very best the region as a whole can offer. In trying times like the ones we live in today, access to resources and a possibility to compare them is the only way forward to bridge irrational political differences and discuss the region with only one vision in mind: to assure and sustain lasting peace.

Zlatko Šabič, Director General of the East Asia Resource Library (EARL).
Mirjam Kotar, Head of the Central Social Sciences Library, Chief Coordinator of the EARL.

Naval Kishore Press – digital: From hidden treasure to open access

Nicole Merkel-Hilf

The Naval Kishore Press was established in the north Indian city of Lakhnau in 1858 by Munshi Naval Kishore (1836-1895). In the following decades it grew into one of India’s most important publishing houses. During Naval Kishore’s lifetime the press published around 5,000 titles covering literature in Hindi, Urdu, Arabic, Persian and Sanskrit on subjects as diverse as religion, education, medicine, school-books, popular editions of Sanskrit literature, and much more.

The library of the South Asia Institute (SAI) at Heidelberg University holds a representative cross section of the Naval Kishore Press’ publications with 1,400 titles in print and around 700 titles on microfilm.

In order to make this treasure more visible for scholars the Naval Kishore Press Bibliography has been set up by using the open source software VuFind. The bibliography is intended as a provenance database and aims to provide access to bibliographic records as well as digitized online editions of works issued by the Naval Kishore Press that are distributed in libraries worldwide, and not only to the SAI library collection. Currently we are enriching the bibliography with 1,200 title records from the Bodleian Library in Oxford. The bibliography will then contain more than 3,500 entries from eight different libraries.

From the mid-19th century onwards, wood pulp paper was used for printing. This paper tends to be acidic, and paper deterioration is therefore a problem for the printed part of the collection. For reasons of preservation the Naval Kishore Press – digital project was initiated by the SAI library and Heidelberg University Library.1 Within this project, selected Hindi and Sanskrit titles in Devanagari script from the Naval Kishore Press collection are digitized, but the primary aim of Naval Kishore Press – digital is to offer scholars more than a digitized image facsimile. The goal is to produce machine-readable texts that can be further edited online by using digital editing techniques.

Suitable OCR software especially for South Asian scripts has long been unavailable due to the complexity of the writing systems and has turned out often to be unsuitable for mass digitization projects. For the Naval Kishore Press – digital project two text recognition methods have been used: the OCR software for Hindi and Sanskrit developed by ind.senz and, more recently, a data model trained by Transkribus.2 For the training of the model, 200 pages of a so-called ‘ground truth’ transcription were produced, i.e. an accurate representation of the text on the image facsimile. The ground truth transcription and the images are then used to train a recurrent neural network to get a data model to automatically transcribe more texts from the Naval Kishore Press collection. With an error rate of 5.59% on a random test set the results are very promising and we are using the model now on the digitized Hindi and Sanskrit texts of the Naval Kishore Press collection.

For the web presentation of the digitized images and the OCRed full-texts created with Transkribus the software ‘DWork – Heidelberg Digitization Workflow’ is used, an in-house development by Heidelberg University Library. It provides a variety of functions for the use of digital copies, such as thumbnail overview, zooming in and out, full text search, and various navigation features as well as components for annotations.

Words or phrases from the Hindi and Sanskrit texts can be searched in Devanagari script or in Latin transliteration and the results are highlighted in the image facsimile as well as the recognized text. Furthermore, users can download a high quality OCR-PDF of the facsimile from the project website where the text is also fully searchable in both scripts.

The annotation tool implemented in DWork allows scholars worldwide to work collaboratively on a text or text corpus independent of place and time. Each annotation can be entered comfortably via a web form, is provided with the name of its author and can be reliably referenced and quoted by being assigned a DOI. Revisions are saved as new versions, while earlier versions remain visible and can be accessed through the revision history.

Both resources can be accessed on CrossAsia: https://themen.crossasia.org

Nicole Merkel-Hilf, Chief coordinator “South Asia” FID Asien, SAI Library.

Notes

1 Naval Kishore Press – digital is part of a larger, three-year project ‘Fachinformationsdienst Asien’ (FID Asien), funded by the Deutsche Forschungsgemeinschaft (DFG) until the end of 2018. The FID Asien project is cooperatively carried out by the State Library in Berlin, Heidelberg University Library and the South Asia Institute. The web portal CrossAsia is used as the central access point to the project results and for scientific information in Asian studies (https://crossasia.org/en).
2 https://transkribus.eu/Transkribus
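
The article reports an error rate for the Transkribus model on a random test set but does not say how that figure is measured. As a rough illustration only, the sketch below assumes a character error rate: the edit distance between each ground truth line and the recognised line, summed and divided by the number of ground truth characters. The function names are hypothetical and this is not the Transkribus implementation or API.

```python
import unicodedata


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_error_rate(pairs) -> float:
    """Error rate over (ground_truth_line, recognised_line) pairs.

    Assumption: characters are counted as Unicode code points after NFC
    normalisation. For Devanagari, counting grapheme clusters instead
    would give a different figure.
    """
    total_errors = 0
    total_chars = 0
    for truth, recognised in pairs:
        truth = unicodedata.normalize("NFC", truth)
        recognised = unicodedata.normalize("NFC", recognised)
        total_errors += edit_distance(truth, recognised)
        total_chars += len(truth)
    if total_chars == 0:
        raise ValueError("ground truth contains no characters")
    return total_errors / total_chars


if __name__ == "__main__":
    # Hypothetical test lines, not taken from the project's test set.
    sample = [("naval kishore press", "naval kishor press")]
    print(f"{character_error_rate(sample):.2%}")
```

Under this assumption, a rate of 5.59% would correspond to roughly one character edit for every eighteen ground truth characters.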