ISSN 0204–2061. KNYGOTYRA. 2008. 51

DIGITIZATION IS NOT ONLY MAKING IMAGES: MANUSCRIPT STUDIES AND DIGITAL PROCESSING OF MANUSCRIPTS

ZDENĚK UHLÍŘ

National Library of the Czech Republic, Klementinum 190, 110 00 Praha 1. E-mail: [email protected]

The author deals with the link between the digital processing of historical documents, especially manuscripts, and manuscript studies, codicology and bibliology, as well as cultural history. The greater part of the paper is a case study of Manuscriptorium, which is provided by the National Library of the Czech Republic.

Key words: manuscript studies, National Library of the Czech Republic.

Introduction

Digitization of historical documents and/or holdings has been in progress for approximately fifteen or twenty years, since about the end of the eighties or the beginning of the nineties of the twentieth century. At first, during the first half of the nineties, digitization meant creating a surrogate, an alternative carrier, something like a "better microfilm". The goal of such digitization was very simple, namely to be a preservation aid. To this day, for some people digitization still counts as preservation. These people do not understand the spirit of the information and/or knowledge society and do not see the challenge of the information and communication technologies that comes with it.

It is questionable whether digitized historical documents are preservation aids, because they are also promotion aids. Such "digital promotion" of historical documents addresses not specialists but rather a general public that is interested neither in historical nor in similar studies but in mere information about the past. In brief, the general public does not want to "consult" a historical document, to study its internal and/or external features; it prefers to see a historical document as a thing, as a physical object that illustrates the more or less known past. The general public is not interested in the sophisticated difference between "the past" and "the history" (res gestae and historia rerum gestarum, in Hegelian words), so it is not interested in digital copies; for such a public a digital copy is a simple caption, a sign for something else, i.e. for the exhibition of the original document. Thus, such an understanding of digitization moves in a vicious circle: the supposed objective was to exclude originals from lending, whereas the factual result is a more massive lending of originals. This primeval understanding of digitization thus results in an internal discrepancy.

The wide expansion of the Internet in the mid-nineties was a great challenge for the digitization of historical documents and/or holdings. At first it was not well accepted by the academic community because of a suspicion of commercialization, at least in most European countries; sometimes it was understood negatively as an "American invention". In the first years of its spread in Europe, the Internet was accepted positively mostly by professional communities and some memory institutions (in the Czech lands especially by libraries and librarians). The keyword of that time was no longer "preservation" but "democratization". It was supposed to mean that keepers, i.e. memory institution professionals, ought not to put up barriers for any interested person, whether an expert or a member of the general public. The Internet presentation of a historical document ought to be accessible to everybody and particularly to "vulgarians" (hommes de la rue, gemeine Menschen). Of course, this was another illusion, because "vulgarians" are interested in other things than digital copies of historical documents, medieval manuscripts etc. On the other hand, intellectuals, although not experts in historical documents, could be interested in them. Moreover, a worldwide public could be, and is, interested in them. Thus, "democratization" means global accessibility that is independent of the social, cultural and national environment but dependent on education and/or erudition. Under another aspect, global accessibility means free access: not necessarily free of charge, but absolutely free of any discriminatory obstacle.

Thus, digitization means a global dissemination of historical documents and/or holdings, a challenge for inter-cultural studies in general and an impulse for manuscript studies in particular. Consequently, digitization is not simply making images; it is and must be much more. Perhaps this is not apparent when digitizing a small number of manuscripts, several units or at most several tens. It becomes apparent, however, after digitizing a critical mass of manuscripts, i.e. perhaps a few hundred. Then simple images, i.e. image sequences, are unmanageable for the end user as well as for the information system administrator. Compound digital documents that relate data (digital images) to metadata (the descriptive catalogue record as well as structural and technical metadata) must be created. And this is only the first step, only the simplest form of the compound digital document, of course. Once the critical mass of compound digitized documents is exceeded, a sophisticated information system must be created that enables orientation, e.g. showing both the whole database and partial collections, as well as navigation,
e.g. enabling not only standard full text search but also combined search using operators, expert search using filters, search implementing graphical variants etc. Further, a more complex form of the compound digital document may also contain full texts, which should be correlated, by the appropriate parts or passages, to the digital images. It may contain several full texts (e.g. an edition of the original text and its translation), which should be correlated to each other and both to the digital images. It may also contain audio documents, which should be correlated to the full texts and the digital images as well, etc. Not only editions of original historical, i.e. primary, documents and their translations but also secondary documents, i.e. documents about the primary documents, may be integrated into the compound digital documents. And there are possibilities that can be multiplied practically ad infinitum. It is clear that such digitization of historical documents, or let us say manuscripts, means a complex digital processing of manuscripts and that it is very important for manuscript studies, codicology, bibliography etc.

Theoretical Suppositions

Accordingly, digital processing of manuscripts is about a paradigm shift. It can be understood in three ways: firstly, in relation to the historical auxiliary sciences (historische Hilfswissenschaften), or more precisely in relation to the so-called quantitative codicology as opposed to the traditional archeology of the book; secondly, in relation to the historical methodology of collective, mass, aggregate, wholesale phenomena as against individual ones; and thirdly, in relation to the so-called pragmatic edition as opposed to the traditional critical and/or semi-critical edition. Of course, such a paradigm shift is a long-term process rather than a fast and brief occurrence; it is a "scientific revolution" in Kuhnian words. Therefore digitization of historical documents and/or holdings is currently an activity in progress whose operations are at present both routine and under research and development.

Traditional ways of representing historical documents and/or holdings consisted of complex and sophisticated descriptions using a terminology with very strong semantic reduction, which bypasses natural language(s) and does not always distinguish between things or objects on the one side and ideas, notions, concepts etc. on the other. Paradoxically, this traditional terminology is based on vernacular, i.e. substantially natural, languages and simultaneously deprecates its own basis, so that translation between linguistic utterances, and consequently between national discourses, is sometimes very difficult. By making more or less accurate copies of historical documents, digitization makes it possible to overcome some of these crucial problems. Moreover, by preparing electronic full texts, digitization facilitates goal-directed navigation and the control and management of a huge amount of data. This is impossible in traditional ways.

Some technical conditions are necessary for the paradigm shift, namely dividing
data from software, standardization of data, and interoperability of tools and systems.

Firstly, dividing data from software, i.e. the mutual independence of data and software, is a substantial condition that enables the other two. The digital, i.e. Internet, environment is in principle heterogeneous, not homogeneous; the tools and systems are various, indeed different, so that sharing data among tools and systems could be a problem if the data were software-dependent, i.e. tool- and/or system-dependent. Such software, tools and/or systems could not import and/or export data from one another and would not interoperate. Problems could also arise in using communication protocols, because profiles need to be implemented according to regular, i.e. open, non-proprietary, software-independent data standards. Thus, dividing data from software is a conditio sine qua non for the digital representation of historical documents and/or holdings.

Secondly, standardization of data is a necessary condition for applying manuscript studies, codicology and bibliography in the digital, i.e. Internet, environment. Standardization means using open, non-proprietary standards for data preparation and exchange. In other words, standardization [...] and for achieving the same result as far as the content is concerned. Thus, for the data creator the principle of standardization means on the one hand flexibility in the use of software tools, editors, processors etc., and on the other hand a variety of choices in the depth of information; and when markup languages are used to represent the inner structure of a document, the principle of standardization means using various manners of concrete application of markup and consequently representing various information and information levels from the same primary evidence. Thus, standardization of data is the means of making data readable for machines and generally understandable for humans.

1 See various XML editors for creating descrip[...]: jEdit; Peter´s XML editor; OpenOffice in combination with SourceForge; GNU Emacs; NoteTabLight. See also special software tools, e.g. MEdit and MTool (access through Internet: <[...]rium.com/Site/ENG/mtool_eng.asp>).

Thirdly, interoperability of tools and systems consists in inter-tool and/or inter-system communication, i.e. in importing and/or exporting data, in data mining, harvesting etc. It is a principal condition for the working of complex modular systems, for the cooperation of their inner components and for cooperation with external tools as well. It is particularly important to guarantee
interoperability between the internal tools of the system and external tools, because it goes beyond the traditional approach to the information system and/or digital library and comes close to a virtual research environment. In this case a virtual research environment means the possibility of creating and processing a personal collection of digital documents (in all probability full text editions of original historical, i.e. primary, documents) using special tools that make it possible to get at information from the point of view of e.g. computational linguistics. Alternatively, a virtual research environment means creating a grid of resources in which one is the main and/or (relatively) independent one and the others are collateral and/or dependent. Thus, interoperability is the most apparent condition for practical work with historical documents and/or holdings.

Paradigm Shift: Ideas and Practice

The practical work with historical documents and/or holdings is not the work of technicians (or, if you like, computer scientists) and librarians (or, if you like, information scientists); it is rather the work of scholars and researchers in the humanities, especially historians and philologists. Of course, there are various historical and philological specializations and sub-disciplines; on the other hand, the theoretical and paradigmatic foundations of all branches of historical and/or philological scholarship are the same, so that I can speak about history and/or philology generally. Since the theoretical transformations that are influenced by the application of information and communication technologies concern the deepest level of the humanities in general and of history and/or philology in particular, we can speak, in the matter of these theoretical transformations, about a paradigm shift that is asserting its rights just now. The relevant questions concerning the paradigm shift in the humanities, history, philology, library and archival science are very complex, but some of them can be singled out as fundamental for digital libraries presenting written cultural heritage [3; 9].

Firstly, a big problem is that codicology, i.e. the discipline that deals with manuscript books, literary manuscripts and the like, is understood almost solely as the archeology of the book. Archeology of the book means that questioning and research are oriented to the physical condition of the manuscript book (the same is valid also for the printed book and consequently for bibliography, which deals with this type of book material) and that questioning and research oriented to the intellectual content, and consequently to cultural history, are more or less extinguished. Thus, the traditional codicology and/or bibliography, which is mainly archeology of the book, is contradictory to the idea of the digital representation of written cultural heritage, which is based on an effort to research a content, not a container. Both conceptions, i.e. archeology of the book as well as researching a content rather than a container, are equally connected with the idea of evaluating historical sources, only from
different points of view: on the one hand, the physicality of the printed environment enables distinguishing an original from its copy, and so physicality is an essential notion for the archeology of the book; on the other hand, the virtuality of the digital environment does not enable distinguishing an original from its copy, but it very well enables comparing different contents, and so virtuality is a basic notion for the digital humanities [6; 13].

Secondly, there is a new conception of content types and/or content levels, as articulated in the IFLA document Functional Requirements for Bibliographic Records (FRBR for short) [5]. It describes a bibliographic conceptual framework that negates the traditional one, which is based on subordination to the idea of the printed edition. FRBR's conceptual framework is much more complex and articulates four gradual levels characterizing the whole existence of a literary work, i.e. an item (a physical object, an individual book, i.e. a purely actual object), a manifestation (a printed edition, a bibliographic unit, i.e. a multiplicity of items), an expression (a version, translation, adaptation, mutation, i.e. a concrete wording), and a work (an artifact as an aspect of a personality, i.e. a purely virtual object). As a consequence of the pure virtuality of the work, some authors hold that the work level has only a theoretical importance and that the levels of work and expression are perceived and understood simultaneously as only one level, called "worxperssion". However, this is too sophisticated in this context. On the other hand, it is very important that only the item level has a real and simultaneously actual, i.e. concrete, existence, while the other levels (manifestation, expression, work) are, as far as their existence is concerned, in some respect real and virtual and perhaps abstract. Thus, the consequence for cataloguing and bibliographical work concerning written cultural heritage is crucial.

Thirdly, there is the concept of a fluid text, which is contradictory to the concepts of archetype, "Urtext" and the so-called best wording as well. Such a contradiction is very important for historians and/or philologists because the idea of the critical edition is based on these notions. While archetype, "Urtext" or the so-called best wording means that there is a firmly given text that can be, or indeed has to be, the base for the critical edition, fluid text means that there is no such possibility. In other words, while archetype, "Urtext" or the so-called best wording means that their wording is transparent to the ideal sense, fluid text means that there is no such possibility, because the proper reality is the mass of individual records that together make up the fluid text. A combination of qualitative and quantitative analysis is necessary so that we can understand the reality of texts and the reality of life as well. Computational linguistic tools are more appropriate for solving such problems than traditional philological methods, and so the way is opened for digital history and/or digital philology [10].

Fourthly, the idea of versioning has been popular during roughly the last twenty years. It
means that preparing critical editions makes no sense, because each individual wording, each individual glossed adaptation has its own wording, which is presumably the right one as opposed to the other wordings. The idea of versioning is one of the basic ideas of the so-called new philology. Agreements and differences between various or all versions are important. Research in the area of versioning is at the same time research in the area of cultural history, in the area of content as opposed to the area of the container. It enables us to see all the linguistic versions at the same level, i.e. as coordinated mutations rather than subordinated translations. This [...] both European integration and globalization. Thus, computational linguistics again stands against the traditional one, so that a place for digital history and/or philology is opened (compare [1; 19]).

Manuscriptorium Digital Library: a Case Study

What I have said hitherto is a mere theoretical conception (more concisely see [24]) that must still be realized in practice. There are only very few attempts at the practical realization of a virtual research environment concerning historical documents and/or holdings. One of them, perhaps the most sophisticated and the largest, is Manuscriptorium (access through Internet: <http://www.manuscriptorium.com>), provided by the National Library of the Czech Republic for the content part and AIP Beroun Ltd. for the technical part.

The National Library of the Czech Republic started its mass digitization in 1995 and 1996 respectively, and in 1997 and 1998 it created the DOBM standard [8], which was adopted as a UNESCO recommendation in 1999. In 1999–2001 the National Library of the Czech Republic was one of the full partners of the European project MASTER (Manuscript Access through Standard for Electronic Records) [23; 20; 21; 22], whose goal was to make a TEI (Text Encoding Initiative) compatible standard, and in 2002 the National Library of the Czech Republic created the MASTER+ standard (msnkpaip.dtd), the goal of which was to enable the digital document, i.e. to connect the descriptive catalogue record with the images, or rather sequences of images, that represent a copy of the original historical document. In 2003 an open catalogue of historical holdings arose, and in early 2004 the operation of the Manuscriptorium digital library was launched. In December 2007 work started on the European project ENRICH (European Networking Resources and Information concerning Cultural Heritage) [18], the goal of which is to integrate resources
that provide historical documents, and especially manuscripts, as much as possible. Thus, the Manuscriptorium digital library is based on wide and rich experience, so it is able to offer and/or provide a relatively advanced service.

The heart of the Manuscriptorium digital library is a catalogue of base records in a format according to the MASTER standard. When catalogue records made by other creators or offered by other providers are imported into the Manuscriptorium base records database, they are converted into the MASTER format if they were originally created in another format (usually MARC21, UNIMARC, Dublin Core, MODS and others). MASTER does not prescribe what depth of information a descriptive catalogue record should have; it depends on each cataloguer's decision. This can be a problem from the point of view of traditional manuscript studies, codicology or bibliography, because the database content can be unbalanced. On the other hand, the delivery of descriptive catalogue records by individual providers or partners can be adjusted to their concrete feasibility and capability. The biggest advantage of the MASTER standard is a very detailed fragmentation of data and consequently a high search consistency. When the possibility of subsequently replacing an existing descriptive catalogue record with a better one is ensured (which is the case in the Manuscriptorium database), the difficulty exists only in theory and not in practice, because cataloguing is not the creation of products that are finished once and for all but a continual process that is never done. Last but not least, the Manuscriptorium database of descriptive records offers several options: firstly, the database of base records can contain various language mutations; secondly, base records can be alternated with special records of lesser weight that express specific points of view, e.g. codicology, art history, musicology etc.; and thirdly, each of the existing catalogue records can be improved. Thus, the Manuscriptorium database of base records is very flexible. Moreover, its search system allows searching with character tolerance, with the use of graphical variants, with the use of various operators or phrases etc. It is robust enough to fulfil the needs of the end user well.

At present the most important part of the Manuscriptorium digital library is the images, i.e. digital copies of original historical documents, although we see that digitization is not only, and perhaps not even chiefly, making images. Of course, the images are scanned at the highest sensible resolution and in archival quality. On the other hand, the Manuscriptorium digital library is not an archival system; it is a presentation system, so images of the excellent, i.e. archival, quality are not provided via Manuscriptorium. All the same, images of several quality levels are provided, typically gallery thumbnails, previews, low/Internet quality, normal quality and, if needed, a black and white optimization with sharper contrast for better reading. Five quality levels are typically provided. However, this is not
obligatory for all potential partners; it depends on the partner's capacity and ability. We can say that three quality levels are desirable, i.e. gallery thumbnails, previews and one sequence of images of better quality. Of course, there are still more possibilities concerning the quality levels of images, but they can be widely used in the future rather than nowadays; this concerns the kind of scanning that would make it possible to see watermarks, to read palimpsests, text under a blotch etc. The question is whether this kind of image scanning should be a standard offer of the Manuscriptorium digital library. In my opinion it need not be, and it will be better to provide such qualities in some specific collateral resource, since the exploitation of such images is not common.

Another part of the Manuscriptorium content is full texts [17]. Full text data are prepared according to the TEI standard; the existing document type definitions are for prose, poetry and factual prose. At present the full text content of Manuscriptorium consists of scholarly editions, or should I say transcriptions, of primary, i.e. original, historical documents; the implementation of full texts of secondary documents, i.e. scholarly papers etc., is being tested. Editions and/or transcriptions of primary, original historical documents have various forms at present and will have them in the future as well. Some of them are born-digital pragmatic editions of individual manuscripts or versions of texts (see e.g. http://www.manuscriptorium.com, query: title: Homiliarium quod dicitur Opatovicense; repository: Národní knihovna České republiky; shelf mark: III F 6), some of them are retro-conversions of traditional printed critical editions (see e.g. http://www.manuscriptorium.com, query: title: Codex gigas; repository: Kungl. Biblioteket – Sveriges nationalbiblioteket; shelf mark: A 148), some of them are translations of original texts, at present only of Latin texts and only into Czech (see e.g. http://www.manuscriptorium.com, query: title: Codex gigas; repository: Kungl. Biblioteket – Sveriges nationalbiblioteket; shelf mark: A 148), some of them are mere hand-made transcriptions without any scholarly apparatus (see e.g. http://www.manuscriptorium.com, query: title: Paměti Jednoty Bratrské z let 1530-1546; repository: Národní knihovna České republiky; shelf mark: XVII C 3), and some of them are made through scanning and OCR with subsequent human control (see e.g. http://www.manuscriptorium.com, query: title: Článkové všeobecného sněmovního snešení; repository: Parlamentní knihovna; shelf mark: F 5042). Such differences in the quality of the full texts implemented within the Manuscriptorium digital library can be seen as very inconsistent. On the other hand, such variability in the quality levels of full texts is very practical and operative. Preparing editions of historical texts is in any case very difficult; paleographical and linguistic skills are necessary. There are few people who are able to prepare such editions, all the more so as skills in humanities computing are not generally widespread.
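
The compound digital document described in this case study (an identifying record, image sequences at several quality levels, and full texts of differing editorial kinds) can be sketched as a small data model. The sketch below is purely illustrative: all class, field and method names are hypothetical, the file names and text snippets are invented placeholders, and nothing here reflects the actual MASTER or TEI schemas or the Manuscriptorium implementation. It also assumes that low/Internet quality, normal quality and the black and white optimization all count as a "sequence of images of better quality".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ImageQuality(Enum):
    """Image quality levels named in the text."""
    THUMBNAIL = "gallery thumbnail"
    PREVIEW = "preview"
    LOW = "low/Internet quality"
    NORMAL = "normal quality"
    BLACK_AND_WHITE = "black and white optimization"


class FullTextKind(Enum):
    """Kinds of full text named in the text."""
    PRAGMATIC_EDITION = "born-digital pragmatic edition"
    RETRO_CONVERSION = "retro-conversion of a printed critical edition"
    TRANSLATION = "translation"
    TRANSCRIPTION = "hand-made transcription"
    OCR = "scanning and OCR with human control"


@dataclass
class FullText:
    kind: FullTextKind
    language: str  # e.g. "Latin" or "Czech"
    text: str


@dataclass
class CompoundDocument:
    # Identification as in the query examples: title, repository, shelf mark.
    title: str
    repository: str
    shelf_mark: str
    # Hypothetical layout: one list of image file names per quality level.
    images: Dict[ImageQuality, List[str]] = field(default_factory=dict)
    full_texts: List[FullText] = field(default_factory=list)

    def has_desirable_image_levels(self) -> bool:
        """Thumbnails, previews and at least one better-quality sequence."""
        present = {level for level, pages in self.images.items() if pages}
        # Assumption: these three levels count as "better quality".
        better = {
            ImageQuality.LOW,
            ImageQuality.NORMAL,
            ImageQuality.BLACK_AND_WHITE,
        }
        return (
            ImageQuality.THUMBNAIL in present
            and ImageQuality.PREVIEW in present
            and bool(present & better)
        )

    def full_text_kinds(self) -> List[FullTextKind]:
        """Editorial kinds of the attached full texts, in stored order."""
        return [full_text.kind for full_text in self.full_texts]


# Illustrative record; file names and text snippets are invented placeholders.
codex = CompoundDocument(
    title="Codex gigas",
    repository="Kungl. Biblioteket – Sveriges nationalbiblioteket",
    shelf_mark="A 148",
    images={
        ImageQuality.THUMBNAIL: ["001_thumb.jpg"],
        ImageQuality.PREVIEW: ["001_prev.jpg"],
        ImageQuality.NORMAL: ["001_norm.jpg"],
    },
    full_texts=[
        FullText(FullTextKind.RETRO_CONVERSION, "Latin", "(placeholder text)"),
        FullText(FullTextKind.TRANSLATION, "Czech", "(placeholder text)"),
    ],
)
print(codex.has_desirable_image_levels())
```

Keeping the editorial kind as an explicit attribute of each full text is one possible way to make the variability in quality visible to the end user instead of hiding it.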

Nowadays the problem of digital full text editions is truly topical and very important as well [14; 16]. Presenting only digital image copies of original historical documents is in any case not enough; presenting digital texts of historical primary documents is necessary for the conversion of media, which is nowadays one of the main tasks of the humanities. Theories concerning the so-called fluid text are appearing [2], as are conceptions concerning versioning as a presentation of multiple appearances of texts [15], and versioning software tools are being created (see "v-machine"). Philological and historical work rises on the one hand, and computing in the humanities dissolves on the other. New ways of studying text transmission and manuscript tradition, with the utilization of specific striking character, are expanding [7]. Accordingly, building digital libraries is a big challenge for scholarly editorial work. The team of the Manuscriptorium digital library collaborates with the team of the Old Czech Department of the Institute of the Czech Language in Prague (see <http://www.ujc.cas.cz/oddeleni/index.php?page=staroces>). This collaboration brings great benefits to both parties.

Music notation is also a text when seen in the semiotic sense, of course, so that music editions are also full text editions in a wider sense. Therefore the Manuscriptorium team is now testing possible ways of doing this. Unfortunately there is no best practice and no workflow available, so various ways must be examined. Two standards that should be well able to represent music notation using XML were identified: MusicXML [12] and MEI (Music Encoding Initiative), which is compatible with TEI (Text Encoding Initiative) [11]. Another potential standard for representing music notation is being developed by the CMME (Corpus Mensurabilis Musice Electronicum) [4]. It has already been established that MusicXML is a proprietary standard and, moreover, that it is a standard for an exchange format for the representation of music notation rather than for an archival one. Thus, although its software support is relatively good, using it is not the best solution. The MEI standard is probably a better choice for an archival format for the representation of music notation, but its software support is at present rather poor. On the other hand, there is a hopeful factor, namely that the community using MEI is growing, so the software support could be better in the future. The same applies to the community using the CMME standard. It will take some time, of course, until a best practice for music editors is ready.

Audio/music documents can also be a part of the compound digital document within the Manuscriptorium digital library. From the technical side this has been no problem for many years. On the other hand, it is a big problem from the copyright point of view. Although the copyright problem does not concern composers (because the compositions covered are already in the public domain), it still concerns performers. Thus, the implementation of audio/music documents
into the Manuscriptorium digital library is possible only on a trial basis, if the performers agree. The audio/music performance is an interpretation of the original source, certainly not the original source itself. On the other hand, music is a performance, not a written source. Consequently the audio representation of music is fundamental for the end user, together with the representation of the written expression of music. Thus, testing the mutual correlation between images, music editions, i.e. full texts, and audio/music documents is an important step in creating a virtual research environment from a more complex point of view. However, all questions and tasks of this activity will be solved only when the problem of copyright on the Internet is solved at the same time.

There are some tasks that lie between the Manuscriptorium digital library and individual research. These tasks concern the virtual research environment in the proper sense because they connect Manuscriptorium as a general resource with the research area of individual persons. The main representatives of the results of these tasks are virtual collections and virtual documents, which are now in preparation and will be tested next year. The goal of the virtual collection is that everybody can choose his/her own collection for his/her individual research; such a virtual collection can at the same time be made accessible to somebody else, or even to everybody, if the creator of the virtual collection so wishes. Personalizing tools, and content creation as well, is an important task in developing the Internet nowadays. Through such personalized collections and/or documents, valuable content that would otherwise be closed and inaccessible can be made accessible. The collective and at the same time individual and/or personal nature of research comes through in the virtual collection and the virtual document. It is fundamental that the virtual collection in particular can be created very flexibly, i.e. both by choosing and bookmarking already existing documents and by choosing documents from the growing database according to a general query. Consequently the virtual collection need not be static; it can be dynamic too, and so the virtual collection is a heuristic tool of wide application. While the virtual collection is or can be a heuristic tool, the virtual document is rather a research result, a product, a publication in the traditional sense. Thus, the virtual collection and the virtual document should be very hopeful and promising outputs of the virtual research environment in the near future.

There are still other ideas for the development of the Manuscriptorium digital library in the long-term view. They concern external tools that can interoperate with Manuscriptorium. Some of them are now being prepared for future testing. The use of some software tools based on computational linguistics is most interesting in this case. There are already some full texts contained in Manuscriptorium, and the full texts, especially editions of primary, original historical documents, will increasingly grow; full text editions have already been identified as a substantial part of the Manuscriptorium
158 digital library. Making full text editions Conclusion available is a high priority among end user Theoretical conception and practical expe- requirements. Therefore a further work rience of the National library of the Czech with the full text editions is also desirable. Republic during the methodical digitiza- So we may very well imagine an export of tion process concerning historical docu- full text edition/s from the Manuscripto- ments and/or holdings have proved that rium database and its/their import into a digitization is really not simply making special corpus of texts where a software tool based on the computational linguistics, e.g. images. Digitization of historical docu- a corpus manager can be applied to the ments and/or holdings means not only and newly created corpus of texts. As for corpus not firstly creating mere digital copies for manager is quite sophisticated tool both for preservation purpose but complex activity searching within texts and for displaying concerning presentation of cultural heri- results, in some cases e.g. also for clustering tage and representation of historical sources. the texts it is very convenient for heuristic Therefore images must be accompanied by purposes of cultural and literary historian other types of documents, i.e. by full texts, as well as of other specialists. Using such audio/music documents, multimodal/mul- tools and also other external tools is an out- timedia documents etc. Digitization of his- look of several years. torical documents and/or holdings leads to And ultimately a futurological vision: the paradigm shift this way and it is one some computer scientists believe that OCR way to the information and knowledge so- can also be applied to manuscripts... ciety.


DIGITIZATION IS NOT ONLY MAKING IMAGES

ZDENĚK UHLÍŘ

Summary

The author discusses the digitization of old historical documents such as medieval and early modern manuscripts, incunabula (books printed before 1500), old printed books (up to 1800), historical maps and, where applicable, archival books, documents and the like. The main idea is that digitization is not only making images: producing an image of a historical document is not enough for archiving and representation. The image must be supplemented with other data, i.e. metadata, so that all information about the historical document can be conveyed and made accessible, because digitization is not a simple technical activity but the worldwide dissemination of historical documents. Three theoretical foundations and technical conditions help to achieve the main goal of digitization: first, data must be separated from software; second, data must be standardized; third, tools and systems must be interoperable. Digitization is thus a transfer from the traditional information, communication and knowledge environment of printed documents into virtual space. In the humanities such a transfer also means a paradigm shift. Paradigm shifts in the humanities, especially in history and philology, can be seen in various places and take various forms. Four of them matter most for the digitization of historical documents: first, the transformation of codicology from the so-called archaeology of the book into quantitative codicology in the sense of cultural history; second, the new conception of content types and/or content levels expressed in the IFLA document Functional Requirements for Bibliographic Records (FRBR for short); third, the concept of the fluid text, which opposes the earlier concepts of the archetype, the "Urtext" and the so-called best reading; fourth, the idea of multiple versions, which has been popular over the last twenty years.

The author then introduces an initiative of the National Library of the Czech Republic, the Manuscriptorium digital library. Today it is one of the largest digital libraries in the world presenting old historical documents. It comprises the creation of digital documents and their integration with various other resources that use different data and metadata standards. The heart of Manuscriptorium is a database of catalogue records conforming to the MASTER standard. When partners use other metadata formats, these are converted to MASTER, so that a heterogeneous environment exists on the one hand and a homogeneous central catalogue is built on the other. For this digital library the simple descriptive MASTER standard was supplemented with structural metadata representing the structure of the original document and with technical data describing the digitization process. The extended MASTER makes it possible to join individual analytical objects/images/pages into a compound document/virtual book. Using the METS standard, the compound document makes it possible, first, to join several catalogue records representing the same document; second, to join all texts and correlate them with the digital images; third, to create a distributed document with links to remote images. In this way the Manuscriptorium digital library can be truly distributed anywhere in virtual space. It is currently intended to begin applying computational linguistics tools and ontologies to this digital library, but these are tasks for the future.

Received May 2008.
