CHAPTER - V

DIGITAL LIBRARY: A CONCEPTUAL FRAMEWORK

5.1. Introduction

The digital library is a recent term used to refer to information systems and services that provide electronic documents from dynamic or archival repositories. Within the past decade, a growing number and variety of digital information sources have arrived in traditional libraries. Continuous advancement in computer systems and communication technology has resulted in a remarkable expansion in the ability to generate, process and disseminate digital information. These developments have created new forms of knowledge repositories and information delivery channels, which go by names such as:

• electronic library
• multimedia library
• library without walls
• hybrid library
• information super-highway
• digital library
• virtual library

Though these terms are used synonymously, there are subtle differences between them. It is very difficult to grasp the term 'digital library' in isolation, so a brief description of the traditional library, electronic library, hybrid library and virtual library is necessary to understand the digital library.

The original vision of the library as propounded by nineteenth-century pioneers like Melvil Dewey and Charles Cutter [1] was more than simply a set of pragmatic devices such as the catalogue, classification system and reference desk procedures. It began in reality with a strong view of the cohesive and interrelated nature of knowledge itself. Other limitations arose from the technology they had at their disposal and inadequate ideas about the user's habit of seeking information.

Libraries constitute a physical space that holds collections. Libraries are also a space for learning and reflection - a public space that brings together diverse populations into one community to learn, gather information and reflect. Traditionally, libraries have been a collection of items stored in a site-specific facility. Access is limited to those who can travel to the library site or can arrange a loan. Thus time and space define the nature of a library as a physical space.

The electronic library is a term that has had a longer usage in the literature than 'digital library', and is associated with an old-fashioned approach. It generally indicates a rather limited approach to the digital library, simply indicating the provision of a range of material in digitized form, within the framework of traditional library provision. Berkeley Digital Library SunSite [2] defines it as follows: "An electronic library is a library consisting of electronic materials and services. Electronic materials can include all digital materials, as well as a variety of analogue formats that require electricity to use; for example, video tapes are an analogue format that requires electronic equipment to view. Thus the term 'electronic library' encompasses all the material that can be held by a 'digital library', and is therefore more inclusive". The essence of the electronic library is that documents are stored and can be used in electronic form rather than on paper and localized media.

The hybrid library is generally taken as lying somewhere on a continuum from the traditional to the digital library, with electronic and paper-based sources used alongside one another. The challenge of the hybrid library is to integrate access to sources in a variety of formats and forms, covering both local and remote resources. A hybrid library provides an environment and services that are partly physical and partly virtual. This model represents the typical 'real world' situation, with pragmatic access to information from a range of media and formats, within an ideal of ever-closer integration and interoperability:

"It follows that most users will continue to be offered a mix of formats via a mix of delivery systems. The challenge for library managers is to create integrated services, which provide a 'seamless' service to the user".

The hybrid library has an element of physical provision and a physical location of material. The hybrid library should be designed to bring a range of technologies from different sources together in the context of a working library, and also to explore integrated systems and services in both the electronic and print environments.

The concept of the 'gateway library' appears to be a form of 'hybrid library'. Richard De Gennaro [3] of Harvard University explains that a gateway library does not replace the book collection with technology. Rather, the gateway, like Janus, looks to the documentary sources of the past, even as it looks in the direction of the electronic sources that will be increasingly available in the future. In these terms, the gateway library and the hybrid library are the same. They describe the 'real world' situation where libraries provide access to a range of different media but also express the ideal of greater integration.

A digital library is a library consisting of digital materials and services. Digital materials are items that are stored, processed and transferred via digital (binary) devices and networks. Digital services are services (such as reference assistance) that are delivered digitally over computer networks. The Digital Library Federation [4] framed a broad definition of the digital library, which reads: "Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities". The Joint Information Systems Committee [5] describes the digital library as "catalogues on-line, many through a standard web browser; university libraries also have increasing number of subscriptions to electronic journals and other on-line information sources. The library thus becomes the entry point to the collections, both physical and virtual, available to the institution."

Both digital and electronic libraries can be virtual libraries only if they exist virtually, that is, if the library does not exist 'in a real sense'. For example, a virtual library can consist of material from a variety of separate libraries that is organized in a virtual space using computers and computer networks. It is better to consider a digital library to be a library/information service, located either in a physical or virtual space or a combination of both, in which a significant proportion of the resources available to users of that service exist only in digital form. The term 'virtual library' has been used at times, though with little consistency in meaning; it is often used to describe collections of web resources. 'Library of the future' seems to be a catch-all term, while 'library without walls' has sometimes been used to refer not only to digital collections, but also to outreach programs with physical materials.

These separate general models may be combined into one overall general model using Crawford's concept of the complex library. Figure 5.1 summarizes the relationship between traditional, electronic, hybrid and digital libraries in a two-dimensional space, which emphasizes relative degrees of distributed access and digital content.

[Figure 5.1: Locating the 'digital library' concept. Axes: locally held collection to distributed collection; analogue to digital.]

5.2. Evolution of Digital Libraries

In 1938, H. G. Wells dreamed of a world encyclopedia in which all human knowledge would be available everywhere. This idea appeared in his famous book 'World Brain'. He narrated his dream very lucidly, stating that in future all human knowledge would be available in one book and that it would be available in any corner of the world. In 1945 Vannevar Bush [6], then director of the U.S. Office of Scientific Research and Development, wrote an article entitled 'As We May Think'. This article is an elegantly written exposition of the potential that technology offers scientists to gather, store, find and retrieve information. He was of the opinion that in future scholars would consult any book by tapping its code on a keyboard. The root of the digital library can be found in this article, which was published in the July 1945 issue of the Atlantic Monthly. Bush commented that "our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose". He discussed recent technological advances and how they might be used at some distant time in the future. He provided an outline of one possible technical approach, which he called 'Memex'. An interesting historical footnote is that the Memex design used photography to store information. Bush had also advocated the idea of networking.

In 1965, Licklider [7] described the idea of the digital library. He coined the phrase 'library of the future' to refer to his vision of a fully computer-based library. He emphasized that research and development was needed to build a truly usable digital library. The roots of present-day digital libraries may be traced to the information retrieval systems of the 1960s and the hypertext systems of the 1980s. Digital libraries have evolved from the techniques and principles developed by early information researchers like Mooers, Perry and Taube. The era of automatic indexing and search systems was pioneered by Salton. Digital libraries are built on the solid foundations of more than three decades of research in information retrieval. In 1998, the Vice-President of the USA, Mr. Al Gore [8], stated that 'a new wave of technological innovation is allowing us to capture, store, process and display an unprecedented amount of information about our planet and a wide variety of environmental and cultural phenomena ... I believe we need a "Digital Earth". A multi-resolution, three-dimensional representation of the planet, into which we can embed vast quantities of geo-referenced data.' He then called on scientists to create a digital map of the earth at a resolution of one meter. Such a project will require technical innovation beyond that required even for a digital library containing every book ever written. The area of the earth in square meters is about 5 × 10^14. Storing two megabytes of data per square meter (which would include terrain data, imaging, environmental and other pertinent information) will require 10^18 bytes, an amount roughly equal to the amount of digital storage currently present on earth.
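The storage estimate above is a simple product of area and data density, and it can be checked with a short calculation. The sketch below takes the two inputs quoted in the text (an area of about 5 × 10^14 square meters and two megabytes per square meter) at face value; the use of decimal megabytes (10^6 bytes) and all names in the code are assumptions made for illustration. Under those assumptions the product comes to 10^21 bytes, while a total of 10^18 bytes would correspond to roughly two kilobytes per square meter.

```python
# Inputs quoted in the text. Decimal (SI) megabytes are an assumption.
EARTH_AREA_M2 = 5 * 10**14      # approximate surface area of the earth
BYTES_PER_MB = 10**6            # assumed: 1 megabyte = 10^6 bytes
MB_PER_M2 = 2                   # data density quoted in the text


def storage_bytes(area_m2: int, bytes_per_m2: int) -> int:
    """Total storage needed to hold bytes_per_m2 for every square meter."""
    return area_m2 * bytes_per_m2


# Total for two megabytes per square meter.
total = storage_bytes(EARTH_AREA_M2, MB_PER_M2 * BYTES_PER_MB)

# Data density (bytes per square meter) implied by a 10^18-byte total.
implied_density = 10**18 // EARTH_AREA_M2

print(f"total bytes at 2 MB per m^2: {total:.0e}")
print(f"bytes per m^2 implied by 10^18 total: {implied_density}")
```

Integer arithmetic is used throughout so that the powers of ten stay exact.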

Since traditional library systems had some limitations, librarians and information scientists started adopting new technology to provide better access to their collections. For several centuries paper has been the primary medium in the conventional library system because of its very attractive properties. New information handling techniques, storage and communication facilities have influenced the library system. One of the main reasons for the appearance of these new media is that they offer many facilities that paper-based storage cannot. In his book on 'paperless publishing', Haynes offers a number of reasons for moving away from the use of paper towards the more extensive deployment of electronic media for publication.

Digital conversion of library materials has advanced rapidly in the past few years. It promises to continue to expand its reach and improve its capabilities with extraordinary speed. Digitization has proven to be possible for nearly every format and medium presently held by libraries, from maps to manuscripts and from moving images to musical recordings. The use of hardware and software for capturing an item and converting it into bits and bytes, matched by a quickly developing set of practices for describing and retrieving digital objects, is giving form to the talk of a 'library without walls'. But such a virtual library has a very real place.

The digital library may be based on an institution, but equally it could be based on a subject discipline, a vocation or profession, a region or even a nation. Such an entity may or may not have a physical location. It may be called into existence very rapidly, and dispersed equally rapidly; indeed it has been suggested that such 'limited life' digital libraries could be created as a response to, for example, medical or environmental emergencies. The true digital library also has within it the capability to disrupt and reconstruct the publishing and knowledge creation system. This being the case, providing realistic models for this type of library is much more difficult than for the previous two, purely on account of its dynamic, multifaceted and multi-choice nature. One approach, which seems to encompass much thought in this area, suggests that a model of the digital library should comprise four structured levels: a) user interface, b) networks and communication, c) information resources, and d) reference service system. These levels support five basic kinds of functionality: digitization, large repositories, fast data transfer, privilege and management.

North American Digital Library System [9] has identified the following purposes of a digital library system:

• to expedite the systematic development of the means to collect, store, and organize information and knowledge in digital form, and of digital library collections;
• to promote the economical and efficient delivery of information to all sectors of the society;
• to encourage co-operative efforts which leverage the considerable investment in research resources, computing and communications networks;
• to strengthen communication and collaboration between and among the research, business, government, and educational communities;
• to take an important role in the generation and dissemination of knowledge in areas of strategic importance to the society;
• to contribute to the lifelong learning opportunities and education in our country.

5.3. Digital Library: a conceptual framework

Defining the term 'digital library' is a daunting task. Some sixty different definitions of the digital library can be found on the Internet, but very few are commonly accepted and fairly straightforward. One common thread among all these definitions is a heavy emphasis on resources and an apparent lack of emphasis on librarians and the services they provide. From these, some prominent definitions have been selected here to provide a conceptual framework for the digital library.

According to Borgman [10], "Digital libraries are a set of electronic resources and associated technical capabilities for creating, searching and using information. In this sense they are an extension and enhancement of information storage and retrieval systems that manipulate digital data in any medium (text, images, sounds, static or dynamic images) and exist in distributed networks. The content of digital libraries includes data, metadata that describe various aspects of the data (e.g., representation, creator, owner, reproduction rights), and metadata that consist of links or relationships to other data or metadata, whether internal or external to the digital library".
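The three kinds of content named in this definition (data, descriptive metadata, and metadata that link to other data or metadata) can be pictured as a simple record structure. The following is only an illustrative sketch: the class names, field names, relation labels and identifiers are hypothetical and are not taken from Borgman or from any particular digital library system.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """Hypothetical link metadata: a relationship to other data or metadata."""
    relation: str            # e.g. "part_of", "cites" (illustrative labels)
    target_id: str           # identifier of the related item
    external: bool = False   # True if the target lies outside this library


@dataclass
class DigitalObject:
    """Hypothetical record combining data with its two kinds of metadata."""
    object_id: str
    data: bytes                                       # the content itself
    descriptive: dict = field(default_factory=dict)   # aspects of the data
    links: list = field(default_factory=list)         # Link instances

    def related(self, relation: str) -> list:
        """Return the identifiers of all items linked by the given relation."""
        return [link.target_id for link in self.links if link.relation == relation]


# Example record; all values are invented for illustration.
record = DigitalObject(
    object_id="obj-001",
    data=b"sample text",
    descriptive={
        "representation": "text/plain",
        "creator": "Example Author",
        "owner": "Example Library",
        "reproduction_rights": "unspecified",
    },
    links=[
        Link("part_of", "collection-01"),
        Link("cites", "ext-42", external=True),
    ],
)

print(record.related("part_of"))
```

Keeping links separate from descriptive metadata mirrors the distinction drawn in the definition, and the `external` flag reflects its remark that links may point inside or outside the digital library.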

Borgman further states that "Digital libraries are constructed - collected and organized - by [and for] a community of users, and their functional capabilities support the information needs and uses of that community. They are a component of communities in which individuals and groups interact with each other, using data, information, and knowledge resources and systems. In this sense they are an extension, enhancement, and integration of a variety of information institutions as physical places where resources are selected, collected, organized, preserved, and accessed in support of a user community. These information institutions include, among others, libraries, museums, archives, and schools, but digital libraries also extend and serve other community settings, including classrooms, offices, laboratories, homes, and public spaces".

Lynch, Clifford and Garcia-Molina, Hector [11] defined the term as follows: "Digital libraries were viewed as systems providing a community of users with coherent access to a large, organized repository of information and knowledge...The ability of the user to access, reorganize, and utilize this repository is enriched by the capabilities of digital technology; ...digital libraries would, for the foreseeable future, need to span both print and digital materials and that the central issue was to provide a coherent view of a very large collection of information. In this sense, an emphasis on content solely in digital format is too limiting. Really, the objective is to develop information systems providing access to a coherent collection of material, more and more of which will be in digital format as time goes on, and to fully exploit the opportunities that are offered by the materials that are in digital formats. Additionally, the comprehensiveness and value of the collection accessible through a digital library system can be strengthened by the ability to integrate materials in digital formats that have not been well-represented, easy to access, or effectively usable in traditional library collections, such as multimedia, geospatial data, or numerical datasets. There is, in reality, a very strong continuity between traditional library roles and missions and the objectives of digital library systems".

Cleveland, Gary [12] compared the digital library with the traditional library and identified the following points. According to him:

• "digital libraries are the digital faces of traditional libraries that include both digital collections and traditional, fixed media collections. So they encompass both electronic and paper materials.
• digital libraries will also include digital materials that exist outside the physical and administrative bounds of any one digital library.
• digital libraries will include all the processes and services that are the backbone and nervous system of libraries. However, such traditional processes, though forming the basis of digital library work, will have to be revised and enhanced to accommodate the differences between new digital media and traditional fixed media.
• digital libraries ideally provide a coherent view of all of the information contained within a library, no matter its form or format.
• digital libraries will serve particular communities or constituencies, as traditional libraries do now, though those communities may be widely dispersed throughout the network.
• digital libraries will require both the skills of librarians as well as those of computer scientists to be viable.

The most important judgement about digital libraries is that they will not be a single, completely digital system that provides instant access to all information, for all sectors of society, from anywhere in the world. This is simply unrealistic. This concept comes from the early days when people were unaware of the complexities of building digital libraries. Instead, they will most likely be a collection of disparate resources and disparate systems, catering to specific communities and user groups, created for specific purposes. They also will include, perhaps indefinitely, paper-based collections. Further, interoperability across digital libraries - of technical architectures, metadata, and document formats - will also only likely be possible within relatively bounded systems developed for those specific purposes and communities".

Seamans, Nan, & McMillan, Gail [13] explained the term. According to them, "the term 'digital library' is not merely equivalent to a digitized collection with information management tools. It is also a series of activities that brings together collections, services, and people in support of the full life cycle of creation, dissemination, use, and preservation of data, information, and knowledge. The challenges and opportunities that motivate an advanced digital library research initiative are associated with this broad view of digital library environment.

A digital library should be a seamless extension of the library that provides scholars with access to information in any format that has been evaluated, organized, archived, and preserved. Access to this evolving collection of digital information is provided through personalized systems as well as through the services of information professionals. The digital library adds value and saves time while shifting the times of access. It reduces need for proximity to information resources, but still emphasizes the quality of those resources. It is a library that can be individually customized and, ultimately, will be easy to use".

According to David Barber [14], "Digital libraries are not exclusively the concern of large institutions or large research universities. Every library in any kind of organization that begins to move beyond bibliographic citations to providing online content for its patrons has begun to build a digital library. Thus, digital libraries can be built by public libraries, school libraries, and university and college libraries.

For a long time electronic collection building efforts have centered around information which is a step to other information. Whatever the value of this information, its online presence cannot be said to constitute a digital library. These resources are simply tools to find content. It is this final content and its availability in digital form that distinguishes a digital library.

Actual digital library holdings, not just digital pointers to print resources, are what it takes to involve libraries in digital library building. The intended holdings need not be vast or managed by sophisticated software. A library that maintains a Gopher or WWW page with pointers to Internet resources (whether those be images, text, or other forms of information) has begun to build a digital library: it is evaluating and selecting among available digital resources and their delivery systems.

The resources in a digital library may be local or maintained remotely by a vendor like OCLC. They may be widely held like the popular periodical databases created by EBSCO, UMI, and IAC, or newer types of complex information resources like the geographic data files maintained by only a few institutions".

The British Library [15] Digital Library Programme Team states: "The Digital Library System (DLS) will be an extensive and hospitable IT system to enable the Library to meet its strategic and operational objectives in relation to the collection of digital materials and the provision of access to them as follows:

• It will enable the Library to store, preserve and provide access to the UK digital published output, whether acquired through purchase or through extension of Legal Deposit. It will support the full volume projections for digital publications required for the Library's collection. It will be the basis of the UK national digital archive.
• It will support greatly increased access to digital materials within the Library's Reading Rooms and remotely, the latter underpinning the Public Library Network and contributing to the National Grid for Learning.
• It will support licensed use of digital materials for remote document supply services - electronic journals, conference proceedings and patents - for the increased benefit of research and industry.
• For wider public access it will support the digitisation of significant parts of the Library's unique historical collections undertaken with the aid of project funding.

... The digital library will consist of a critical mass of digitally held documents - words, still images, moving images, sound and any combination of these. These documents may be held in more than one place and their provenance may be more than one institution. Provision of the documents will be subject to agreement with and, as required, recompense to copyright and intellectual property owners. The material is and will be both current and historical, and in principle covers all subject areas".

The Colorado Digitization Project of Colorado University [16] has provided more comprehensive information on the digital library. Definitions of the digital library range from narrow to broad, with some disagreement on the actual function of a digital library. Clifford Lynch defines the digital library as "an electronic information access system that offers the user a coherent view of an organized, selected, and managed body of information". This definition recognizes the more traditional role of a library in the digital world: at the very least, a digital library should offer services along with information, such as indexing or cataloging. Users of digital library resources may come to expect (and already do expect) more sophisticated functions, such as information and knowledge management services, resource discovery mechanisms, and personalization of access and monitoring of new and existing digital resources. In his paper, "What is a Digital Library? Definition, Content and Issues," Harter discusses the properties of a digital library in terms of several categories on a scale from the narrow view to the broad. Some of the concepts he considers include: types of information resources; selection, organization, and authority control of information resources; authorship; physical and logical location of resources; access to resources; who the defined user groups are; and what services are offered and by whom (or what). The Task Force on Archiving Digital Information makes a further distinction between digital libraries and digital archives. The Task Force defines digital archives as "repositories of digital information that are collectively responsible for ensuring, through exercising various migration strategies, the integrity and long-term accessibility of the nation's social, economic, cultural, and intellectual heritage instantiated in digital form". Digital libraries, on the other hand, collect and provide access to digital information, but they may or may not provide long-term storage and access to that information.

The Committee on Computing, Information, and Communications of the National Science and Technology Council [17] clarified that, conceptually, a digital library is analogous to a traditional library in the diversity and complexity of its collection. A single digital library contains many terabytes of information, distributed throughout the world. Digital libraries are designed to be used by a broad spectrum of the population - not only scholars and researchers, but educators, students, and the general public.

According to Edward Fox [18], "The phrase 'digital library' evokes a different impression in each reader. To some it simply suggests computerization of traditional libraries. To others, who have studied library science, it calls for carrying out of the functions of libraries in a new way, encompassing new types of information resources; new approaches to acquisition (especially with more sharing and subscription services); new methods of storage and preservation; new approaches to classification and cataloging; new modes of interaction for patrons; more reliance on electronic systems and networks; and dramatic shifts in intellectual, organizational, and economic practices".

High Performance Computing and Communications Information Technology Subcommittee, IITA Task Group [19] stated that the remarkable expansion in the generation and dissemination of digital information in the last decade as a result of the availability of high-speed electronic networks has dramatically changed the nature and role of data archives and traditional libraries. Since the mid-1980s, information sources accessed via the Internet have multiplied rapidly. These include a mixture of data and knowledge sources in all electronically available forms: reference volumes, books, journals, newspapers, national phone directories, sound and voice recordings, images, video clips, scientific data as well as private information services such as stock market reports and private newsletters. These knowledge sources, when connected electronically through a network of networks, are the ingredients of a digital library.

Leiner, Barry M. [20] has identified the Digital Library as:

• The collection of services
• And the collection of information objects
• That support users in dealing with information objects
• And the organization and presentation of those objects
• Available directly or indirectly
• Via electronic/digital means.

The National Digital Library [21] envisioned the digital library as a distributed collection of converted library materials and digital originals to which many American institutions will contribute. The Library of Congress's contribution to this World Wide Web-based virtual library is called American Memory and is created by the Library's National Digital Library Program. The aim of the American Memory project is to provide a gateway to rich primary source materials relating to the history and cultural development of the United States. The ability to represent and store text and images in digital form and the existence of the Internet combine to permit access from libraries, classrooms, and homes across the country to resources that have previously been available only to those who visited the particular libraries and archives where the physical artifacts were housed. Because digital resources can be used remotely and repeatedly without damage or deterioration and can be presented in new ways as technology develops, the digital collections being assembled can be mined to fuel the educational process at all levels for generations to come.

According to Marchionini, Gary [22] the concept of a digital library has different meanings in different communities. To the engineering and computer science community, the digital library is a metaphor for the new kinds of distributed database services that manage unstructured multimedia data. To the political and business communities, the term represents a new marketplace for the world's information resources and services. To futurist communities, digital libraries represent the manifestation of Wells's world brain. The perspective taken here is rooted in an information science tradition. Digital libraries are the logical extensions and augmentations of physical libraries in the electronic information society. Extensions amplify existing resources and services, and augmentations enable new kinds of human problem solving and expression. As such, digital libraries offer new levels of access to broader audiences of users and new opportunities for the library and information science field to advance both theory and practice.

According to Maxymuth, John [23], defining the digital library is not an absolute science. To begin, we could view a digital library as being analogous to a traditional library in that they both aim to provide access to a collection of materials to their users. The most idealistic proponents of digital libraries often see them as having nearly unlimited possibilities for bringing extensive multimedia resources for all levels of research worldwide to the screens of any connected users, regardless of age or background.

The National Library of Canada [24] has identified the following points to be included in the concept of digital library:

• digitization projects (i.e., projects to convert substantial collections of books, manuscripts, articles, files, unique reference tools / finding aids / indexes, photographs, illustrations, maps, sound recordings, video, 3-D objects, etc., to digital versions for Web access);
• creation of original, completely new resources which did not previously exist (e.g., a virtual exhibition);
• Web-based access to reference sources and databases. This includes organized, annotated subject directories with links to Web resources such as the Toronto Public Library's Virtual Reference Library or the Canadian Information by Subject Web site.

At the same time, the NLC excluded the following kinds of projects or activities to avoid duplication with other inventories and directories:

• current publishing per se, such as publishing books, reports or magazines on the Web;

• home pages and on-line public access catalogues (OPACs) of libraries and other organizations (these are found in other directories);

• this includes simple Web sites noting institutional hours of service, locations, services available, publicity, etc.;

• "White-page" directories, i.e., alphabetical directories of names and addresses;

• systems projects aimed at developing or implementing software, hardware, or telecommunications;

• projects which do not involve content creation, such as obtaining institutional or consortia licenses for electronic publications or providing licensed access to commercial electronic periodicals;

• electronic resources (e.g., CD-ROMs) which will not be available via the Web;

• research projects which will not produce electronic resources.

Patel, Rajesh [24] defined it as "Digital Library is a collection of electronic 'resources' that provides direct or indirect access to a systematically organized collection of Documents".

According to Saffady, William [25], "Broadly defined, a digital library is a collection of computer-processible information or a repository for such information. In non-library applications, the phrase has been widely applied to centralized repositories of computer programs or machine-readable data. That usage has a long history, particularly in scientific and technical applications...a digital library is a library that maintains all, or a substantial part, of its collection in computer-processible form as an alternative, supplement, or complement to the conventional printed and microfilm materials that currently dominate library collections".

According to Sorkin Virginia D., & Farley, Judith. [26] "Digital libraries, like traditional ones, will select, acquire, catalog, make available, and preserve collections. The major difference will be that digital libraries will consist of machine-readable data. This implies that the traditional concept of a collection must be revised to accommodate materials that are accessible electronically. Digital content will be of two types: items created and existing primarily in machine-readable format and materials converted from the traditional formats (e.g., printed texts and pamphlets, images, motion pictures, and recorded sound)".

After examining the above definitions, one can differentiate these terms as follows.

A *library* is an organized collection of items of various formats (books, journals, videos, CD-ROMs, etc.) along with the services required to make them available to a given user group or groups. It is not a collection of programming routines, at least in this context.

An *electronic library* is a library consisting of electronic materials and services. Electronic materials can include all digital materials, as well as a variety of analog formats that require electricity to use. For example, videotapes are an analog format that requires electronic equipment to view. Thus the term "electronic library" encompasses all the material that can be held by a "digital library", and is therefore more inclusive. It is, however, out of style.

A *digital library* is a library consisting of digital materials and services. Digital materials are items that are stored, processed and transferred via digital (binary) devices and networks. Digital services are services (such as reference assistance) that are delivered digitally over computer networks. One of the best examples of a digital library is the U.S. Library of Congress American Memory collection.

Both digital and electronic libraries can be *virtual libraries* if they exist only virtually - that is, the library does not exist "in real life". For example, a virtual library can consist of material from a variety of separate libraries that is organized in a virtual space using computers and computer networks. One of the best examples of a virtual library is the Networked Computer Science Technical Reports Library (NCSTRL).

The term 'digital library' appears now to be preferred for a concept which is also described variously as virtual library, electronic library and library without walls. These terms are essentially used interchangeably, but the concept of the digital library needs more careful definition. The term "digital library" contains two important words: digital, i.e., information in any digitized format, and library, i.e., a total mechanism for obtaining access to, storing, organizing and delivering information. A digital library remains a library, with the same purposes, functions and goals as a traditional library. The digital part of the term indicates merely that the material is stored and accessed digitally. A digital library is therefore far more than a digital collection, particularly a collection of relatively volatile current information of the kind found on the overwhelming number of World Wide Web sites on the Internet.

The goal of the New Zealand Digital Library project was to explore the potential of Internet-based digital libraries, in which large collections of electronic, predominantly textual, documents, physically dispersed on computers the world over, are accessible through a uniform interface that allows information to be located and accessed. The vision was to develop systems that can automatically impose structure on fundamentally anarchic, un-catalogued, distributed repositories of information, thereby providing users with effective tools to locate the information they need and to peruse it conveniently and comfortably.

After examining the above definitions, the following elements were found to be common among many of them. Digital libraries:

• serve a defined community or set of communities
• may not be a single entity
• are underpinned by a unified and logical organizational structure
• incorporate learning as well as access
• make the most of human ("librarian") as well as technological resources
• provide fast and efficient access, with multiple access modes
• provide free access (perhaps just to the specified community)
• own and control their resources (some of which may be purchased)
• have collections which are large, and persist over time
• are well organized and managed
• contain many formats
• contain objects, not just representations
• contain objects which may be otherwise unobtainable
• contain some objects which are digital ab origine

From the above discussion, the following features can be enumerated.

5.4 Salient features of Digital library: -

After a detailed study of various definitions and opinions on the digital library, the following salient features can be identified:

i) Site neutrality
ii) Open access
iii) Greater variety and granularity of information
iv) Sharing of information
v) Up-to-dateness
vi) Always available
vii) New forms of rendering
viii) Everything can be stored
ix) Universal accessibility
x) Modifiability
xi) Platform scalability

i) Site neutrality: - Users of digital libraries get what they want at their office or at home. They do not have to travel to the library to access the material. The digital library follows a ubiquitous, anytime, anywhere access paradigm. Users of a digital library have access to a large collection, which is maintained and organized by different types of libraries and archives. There is a library wherever there is a personal computer with a network connection.

ii) Open access: - Powerful access and browse facilities invest DLs with 'open access library' characteristics, facilitating serendipitous discovery of information. Whatever is available on the Web is transparent to end users.

iii) Greater variety and granularity of information: - Information is not limited to metadata, bibliographic information, text, or discursive information; all digital objects, and all objects that can be digitized, are potential DL content. The digital information generated by different organizations and institutions shows that the available information has variety and granularity.

iv) Sharing of information: - Digital libraries enhance the shared-resources concept of traditional libraries. Marchionini captures this feature very aptly in his concept of the 'sharium'. Sharing information is a long tradition of libraries, and it is a special feature of digital libraries.

v) Up-to-dateness: - Information is current, with no time lag between creation and availability. Many websites are regularly updated, so factual and current information is always available on the Internet.

vi) Always available: - There are no library hours. Resources are available 24 hours a day, 7 days a week, 52 weeks a year. One of the main constraints of a physical library, fixed library hours, is overcome by DL systems. Since the doors of digital libraries are always open, time is not a barrier for users. They can access the information at any hour of the day.

vii) New forms of rendering: - Rendering of information is not limited to 'text' or any one kind of symbol. In many disciplines, in both the traditional sciences and the humanities, content can be rendered in more imaginative and cognitively appealing ways than ever before; in mathematics, for example, this applies to formulae. Similarly, authors in other areas such as chemistry, architecture, or sociology now recognize that digital library technologies offer them alternative ways of expressing and rendering their content.

viii) Everything can be stored: - A digital library not only makes printed books and journals available in digital form but also stores photographs, theatrical performances (including opera and ballet), legislative material, court decisions, museum objects, recorded music, speeches, movies, videotapes, etc. The total number of different books produced since printing began does not exceed one billion (the number of books now published annually is less than one million). If an average book occupies 500 pages at 2000 characters per page, then even without compression it can be stored comfortably in one megabyte. Therefore, one billion megabytes are sufficient to store all books. This is 10^15 bytes, or one petabyte. At a commercial price of $20 per gigabyte, this amount of disk storage capacity could be purchased for $20 million. So it is certainly feasible to consider storing all books digitally.

ix) Universal accessibility: - In the past, the concept of universal accessibility (or universal access) in DL research has been mainly related to requirements such as interoperability, networking, and intuitiveness in the users' interaction with the system. DL systems are designed for the broadest possible range of end-user requirements, skills and preferences (design for all, or universal design). Thus, universal accessibility has implications for both the content and the user interface of DL systems. Content accessibility encompasses two elements, namely the assembly of the required (collection of) information entities, and the availability of the latter in alternative forms and media. User interface accessibility, on the other hand, entails the provision of the software and hardware components that are necessary for users to access the collections in a manner that is effective, efficient, and satisfactory. Traditionally, the notion of accessibility has been loosely served by contemporary technological developments, since these were targeted at the average user in a business environment.

x) Modifiability: - Modifiability is the ability of a system to be extended to accomplish additional functionality. Such extensions in functionality usually aim to facilitate incremental growth in the scope of informational use, as well as long-term access to a DL. In certain cases, modifiability may entail provisions on the part of the interactive computer-based system to account for individual needs, requirements and preferences, or contexts of use. In such cases, modifiability implies the ability to adapt to a changing environment, diverse user groups and different usage patterns. Such adaptations may relate to content (e.g. representing a piece of information in alternative media so that users can select the most preferable) or to the user interface. Though both are necessary, user interface adaptations have been of primary interest to the HCI research community in the recent past.

xi) Platform scalability: - Platform scalability refers to the requirement that DL systems be extensible, so as to be capable of incorporating the next generation of technology. Scalability is important since DL systems are likely to have long life cycles and to serve the informational and situational needs of users across different generations. Thus, for instance, the multimedia content in a DL may need to be presented both to a scientist and to a young child, each accessing the library through terminals with different capabilities. Scalability should allow the DL system to provide the required information in a way that is both comprehensible to the target users and customized to the designated context.
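The storage estimate given under "Everything can be stored" can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: it assumes one byte per character, decimal units (1 MB = 10^6 bytes), and a disk price of $20 per gigabyte, which is the unit price consistent with the $20 million total quoted in the text. The constant and function names are illustrative.

```python
# Back-of-the-envelope estimate of the storage needed for all printed books.
# Inputs are the round figures quoted in the text; the price per gigabyte is
# an assumption chosen to match the $20 million total, not a market figure.

PAGES_PER_BOOK = 500
CHARS_PER_PAGE = 2_000           # one byte per character, no compression
TOTAL_BOOKS = 1_000_000_000      # upper bound on distinct books ever printed
PRICE_PER_GB_USD = 20            # assumed disk price

BYTES_PER_MB = 10**6
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15


def estimate_storage():
    """Return (bytes per book, total bytes, total disk cost in USD)."""
    bytes_per_book = PAGES_PER_BOOK * CHARS_PER_PAGE
    total_bytes = bytes_per_book * TOTAL_BOOKS
    total_cost = (total_bytes // BYTES_PER_GB) * PRICE_PER_GB_USD
    return bytes_per_book, total_bytes, total_cost


if __name__ == "__main__":
    per_book, total, cost = estimate_storage()
    print(f"Per book:  {per_book / BYTES_PER_MB:.0f} MB")
    print(f"All books: {total / BYTES_PER_PB:.0f} PB")
    print(f"Disk cost: ${cost:,}")
```

Under these assumptions one book takes one megabyte, one billion books take one petabyte, and the disk cost comes to $20 million, which agrees with the figures in the text.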

5.5 Components of digital library

In spite of the variations across individual DLs, one can derive several components ('facets') that constitute a DL. The key components of a DL include:

• The information store

• Content creation and capture

• Search, display and access

• Distribution

• Rights management

• Interoperability

• Preservation and maintenance

• Economic, legal and societal aspects

• Planning and management

This is summarized by T.B. Rajashekhar [27], as shown in figure 5.2. The core technical issues that one faces in establishing and operating DLs have to do with the digital objects stored in a DL: How does one create them? How does one store them? How does one describe them? How does one find the information contained in them? And how does one deliver them?

[Figure 5.2: Components of a DL. Labels recoverable from the figure: Management; Economic, Legal and Societal Issues; Rights Management; Content Creation and Capture; Information Store; Search and Access; Preservation and Maintenance; Distribution]

Since the digital library is not adequately defined, there are different opinions about its structure and architecture. If one looks at it from the traditional library point of view, one can see the following important ingredients of a digital library:
a) Content, digital objects and collections
b) Design and architecture
c) Resource discovery tools
d) Interfaces

The term digital library is a broader term which involves two words, i.e. digital and library. The basic properties of these two words are shown in figure 5.3.

[Figure 5.3: Ingredients of DL. Labels: Computing; Networking; Content; Collection; Services; Community]

Content is common to both environments, i.e. digital and library. One cannot undertake any type of digitization activity without content. One perspective on DLs is that they provide dynamic knowledge spaces for the knowledge creation process. The flexibility offered by the digital medium to authors in representing and rendering their ideas is immense. Today it is possible to think of content or information beyond traditional text largely due to this capability. New genres of documents are evolving. Text is enriched with images, sound and movies. The hypertext/hypermedia modes provide non-linear presentation possibilities. Tens of millions of content objects (speeches, music, poems, articles, books, sculptures, paintings, movies, etc.) are being created all over the world. Efforts are being made to convert some of them into digital form. This requires capture and conversion to a digital representation suitable for the relevant media form. From this discussion the following model is generated.

5.51 Content :-

Digital collections can be built using a variety of strategies. The digital library requires a greater and more coordinated effort for creating digital objects. This can be achieved by the following methods.

Digitization of existing materials

Digitization refers to the creation of an electronic surrogate for a physical item. This is usually accomplished through scanning and document creation. As the fundamental architecture of a digital library becomes stable, existing materials will be digitized and added to the digital collection.

Resources Created Outside the Library

Building digital collections can entail more than digitization of existing resources. Purchasing or providing access to external resources can extend the scope of the library collection, either by eliminating duplication of effort at multiple locations or by providing access to materials which the library may not be able to digitize.

Acquisition of Original Digital Works

This covers the acquisition of materials that are "born digital". These include materials such as electronic books, journals, or datasets created by publishers or scholars that originate electronically rather than being scanned from paper or other fixed media. They provide a means of making available to users resources not already in the library in traditional formats.

Providing Access to External Collections

Providing links to external web sites, other library collections, or publishers' servers is also a method of increasing the materials available to local users. However, libraries do not have long-term control over items accessed from external collections; such items may be modified, discontinued, or allowed to stagnate without input from the libraries that link to them.

Collection Development

A collection development plan for such resources, comparable to current selection criteria for traditional materials, is desirable. Issues which might be considered:
a) Criteria for selecting resources (value of information, technical specifications, etc.),
b) Choosing from multiple versions available through different aggregators,
c) Choosing between print and online versions (or both),
d) Choosing between linking to external resources vs. buying and maintaining online resources locally,
e) Procedures for discontinuing print subscriptions for materials available in electronic format,
f) Procedures for "weeding" obsolete materials.

A digital resource is defined as an information resource that is available in digital form. Electronic texts, databases, and spreadsheets are commonplace, while databases of digital images, hypertexts, digital sound and video recordings, computer simulations, and mixed media resources are only now coming into the mainstream. Some data resources are static and unchanging; others are dynamic or regularly updated or amended. Data resources are made accessible to users in a variety of different ways. They may be hosted on remote computers and accessed by users via the Internet. Alternatively, users may have to acquire the data, and mount and use them locally. Content has always been centre-stage in the digital library field. Creating content - the information that users seek and libraries help to provide - is at the heart of designing, developing, and building digital libraries. The various stakeholders (authors, publishers, users, librarians, and others) are all tied to each other through the content. Authors have content to disseminate and distribute, while publishers and librarians add value to this content and its distribution. Communication through content creation and delivery relies upon supporting technology and techniques. The effectiveness of the message depends upon the representation and rendering of the digital content.

The content of digital libraries may be text, images, audio, video, computer programs, and other forms. Newly created content is often born digital, while older resources are typically digitized through some conversion process. Both must be represented digitally, so character encoding, formats, files, etc. are the main issues in building digital library content. Building digital content encompasses creation, capture, conversion, storage, organization, search, retrieval, presentation and re-use. Content is of two types: one is developed in-house by the library and the other is procured from outside vendors. If the content is created in-house, one can take the help of internal as well as external experts available to the organization. Sometimes it is very difficult to create digital content in-house, so it has to be procured from outside vendors. One should be very careful in selecting content from external agencies. Vijaykumar & Jeevan [28] comprehensively described the criteria for assessing externally procured content.

Authority of contents: - The authority of contents can be judged by asking these questions: Is it complete and internally consistent? Is it coherent in relation to other related material? Is there an authorized canonical version? Other issues to consider are the intellectual level at which the subject matter is discussed; the reputation of the publisher, compiler and indexer/abstractor in the field; the style of subject presentation, coverage, updating, and language; comprehensiveness; and currency of the subject matter.

User level: -

Digital documents, whether created through digital conversion or born digital, should be useful to the specified user community. They should be easy to use and their presentation should be legible. These documents are software dependent, so the software should be user friendly. At every possible stage help should be provided through help messages and instructions. If a user comes across any errors, the errors should be well documented.

Search capability: -

The search facility should be powerful. Users should be in a position to search the database using Boolean and logical operators. Whenever required, users should be able to search the documents by a particular author, title, date or range of records. Response time is a very important ingredient: one has to find out how fast the information is displayed. Other important points to consider are: the exhaustiveness of information in records; the mechanism and transparency by which bibliographic-to-full-text linking is guaranteed; searchable text fields, i.e. exhaustive indexing conducted to make searches amenable to different fields; and graphic support. Graphics are often found to require more time to download.
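As an illustration of the kind of search capability described above, the following is a minimal sketch of Boolean searching (AND, OR, NOT) over title words, combined with filters on author and a range of years. It is not taken from any particular digital library system; the `Record` structure, the function names and the sample records are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical bibliographic record."""
    author: str
    title: str
    year: int


def build_index(records):
    """Map each lowercased title word to the set of record positions using it."""
    index = {}
    for pos, rec in enumerate(records):
        for word in rec.title.lower().split():
            index.setdefault(word, set()).add(pos)
    return index


def search(records, index, all_of=(), any_of=(), none_of=(),
           author=None, years=None):
    """Boolean search: AND over all_of, OR over any_of, NOT over none_of,
    followed by optional author and (first_year, last_year) filters."""
    hits = set(range(len(records)))
    for term in all_of:                      # AND
        hits &= index.get(term.lower(), set())
    if any_of:                               # OR
        either = set()
        for term in any_of:
            either |= index.get(term.lower(), set())
        hits &= either
    for term in none_of:                     # NOT
        hits -= index.get(term.lower(), set())
    if author is not None:
        hits = {p for p in hits if records[p].author.lower() == author.lower()}
    if years is not None:
        first, last = years
        hits = {p for p in hits if first <= records[p].year <= last}
    return [records[p] for p in sorted(hits)]


# Made-up sample data, for illustration only.
RECORDS = [
    Record("Author A", "Digital library design", 1998),
    Record("Author B", "Virtual library services", 2001),
    Record("Author A", "Digital preservation methods", 2003),
    Record("Author C", "Library catalogue history", 1985),
]
INDEX = build_index(RECORDS)

if __name__ == "__main__":
    for rec in search(RECORDS, INDEX, all_of=["digital"], none_of=["preservation"]):
        print(rec)
```

Building the index once and answering queries with set operations is one simple way to keep response time low, which the text identifies as an important criterion; a production system would also handle punctuation, stemming and full-text fields.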

Display capabilities: -

The display capability of a system can be judged with the help of the following points:
a) Managing search results: features and support provided by the system to manage the query field.
b) Display format: style and variation of displaying results.
c) Sorting: does the system support arranging search results in a sorted order?
d) Avoiding errors: how free is the system from typographic or other errors in display?
e) Appearance: aesthetically designed colour combinations and headings for display.

Documentation

Detailed information about the digital source is required for its full utilization.
a) Manuals: should be carefully designed and explanatory.
b) Online help messages: an electronic version of the manuals as context-sensitive messages in a quick access mode.

Technical support

Technical support should be available for the digital source. It can be judged with the help of the following points:
a) Speed: How much time do vendors require to provide support? Do they have service points in your vicinity?
b) Depth: Up to what level is technical support provided?
c) Duration: For how long are services offered?
d) Nature: Are the services provided free or on payment?

5.52 Community / users : -

The stakeholders - the different communities of people who use and benefit from digital libraries - have different views of what DLs are and what they can do. Most governments around the world have positive views about the ability of digital libraries to enhance 'equity of access' to information. Governments perceive DLs as a means of overcoming the digital divide. Much of the digital library movement in the US as well as in other countries has stemmed from governmental initiatives and drive. Networks have been developed and strengthened to support the building of a National Information Infrastructure. Governments and government agencies perceive DLs as information infrastructure that enables them to realize the social mission of equity of access.

The new medium of digital libraries is perceived with ambivalence by the publishing industry. For the publishing industry, digital libraries are new modes of distribution, and a new competitive market for their continued growth. In view of the threat to their traditional roles and markets, publishers are adopting the new paradigm of electronic publishing through integration of new media and new partnerships with other agencies and institutions.

The symbiotic relationship between libraries and education is a classic case of collaboration and working in tandem. DLs have further amplified and augmented this relationship. For educators and teachers, digital libraries represent new learning resources, supported by a broadening of media centers and multimedia content. Many projects and initiatives have been undertaken to further the cause of education through digital library developments. Governmental and educational institutions, and other agencies and individuals, enthused by the benefits of digital libraries, have vigorously pursued their development.

To the library community, digital libraries are a further step in the continuum of newer media of publishing, as well as a newer technological and organizational framework for continuing and revitalizing their mission of accessing and disseminating information and knowledge. Libraries and the librarian community have always embraced changing technologies and societies and adapted themselves to them. Today, librarians look at DLs as a means for more direct involvement in the dissemination process.

In order to obtain the desired data, a user may expect simply to use a GUI or a metaphor that replicates a physical library by means of computer graphics. However, a digital library provides services to so many kinds of users that it requires a specific user interface for each user class. This variety is the most important problem to resolve in developing user interfaces.

5.53 Digital library services:

One of the first uses of computerization was for the compilation of library catalogues. At first computers were used as a part of the printing process, and later they entered into the process of designing on-line catalogues. Large indexes and abstract publications have followed a similar path, going from print, to print via computer, to on-line and to CD-ROM. Computer technology has also led to the development of search tools such as citation indexes and concordances which are produced automatically. A number of search processes, such as Boolean operators, have been developed and refined over the last few years. The library services which are extended and provided in the digital library environment include answering reference questions, conducting statistical analysis, creating customized maps, providing help with search tools, providing readers' advisory services, and providing access to and assistance with commercial services.

Digital Library Services (DLS) at the University of Michigan, launched in 1993, is a campus-wide program focused on the development and maintenance of digital resources, and provides principal technology management services and support for the University Library system. DLS focuses on: developing and managing digital collections and access tools; providing electronic publishing capabilities and services for UM units and individuals; providing and sustaining a comprehensive computing environment for University Library patrons and staff; providing frameworks and systems to federate distributed information resources; and serving as a catalyst for addressing electronic information issues on campus. DLS is organized into four units:
a) Library Systems: responsible for managing the Library's online catalog and related tools;
b) Desktop Support Services: responsible for managing the computing infrastructure of the Library;
c) Digital Library Production Service: responsible for managing and developing digital collections; and
d) Scholarly Publishing Office: the Library's digital publishing office, responsible for the development of tools and mechanisms to enable electronic publication and dissemination of journals and monographs, including the creation and development of tools and services to support this activity.

The Collaborative Digital Reference Service (CDRS) [29] provides professional reference service to researchers any time, anywhere, through an international, digital network of libraries and related institutions. The service uses new technologies to provide the best answers in the best context, by taking advantage not only of the millions of Internet resources but also of the many more millions of resources that are not online and that are held by libraries. CDRS supports libraries by providing them additional choices for the services they offer their end users. Libraries can assist their users by connecting to the CDRS to send questions that are best answered by the expert staff and collections of CDRS member institutions from around the world. Local, regional, national, and global: the library tradition of value-added ser

CDRS is currently a "library to library" network for asking and answering reference questions. The three main components of CDRS are:

1. Member Profiles: member strengths and features.

2. Request Manager: software for entering, routing and answering reference questions.

3. Knowledge Base: a searchable database of question and answer sets (status of the Knowledge Base).
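To make the interplay of these three components concrete, the sketch below models them in a few lines. It is a hypothetical illustration only and does not describe how CDRS is actually implemented: the class names, the rule of routing a question to the member whose stated strengths overlap most with the question's subjects, and the exact-match knowledge base lookup are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MemberProfile:
    """Hypothetical member profile: a library and its subject strengths."""
    name: str
    strengths: set


@dataclass
class ReferenceService:
    """Toy request manager with an in-memory knowledge base."""
    members: list
    knowledge_base: dict = field(default_factory=dict)  # question -> answer

    def route(self, subjects):
        """Return the member whose strengths overlap most with the subjects,
        or None when no member covers any of them (assumed routing rule)."""
        wanted = set(subjects)
        best = max(self.members,
                   key=lambda m: len(m.strengths & wanted),
                   default=None)
        if best is None or not (best.strengths & wanted):
            return None
        return best

    def ask(self, question, subjects):
        """Answer from the knowledge base if possible, otherwise route."""
        key = question.strip().lower()
        if key in self.knowledge_base:
            return ("knowledge_base", self.knowledge_base[key])
        member = self.route(subjects)
        return ("routed", member.name if member else None)

    def record_answer(self, question, answer):
        """Store a question and answer set for later reuse."""
        self.knowledge_base[question.strip().lower()] = answer


if __name__ == "__main__":
    service = ReferenceService(members=[
        MemberProfile("Library A", {"law", "history"}),
        MemberProfile("Library B", {"chemistry", "maps"}),
    ])
    print(service.ask("Where can I find old survey maps?", ["maps"]))
```

The point of the sketch is the division of labour: profiles describe what each member can answer, the request manager decides where a question goes, and answered questions accumulate in a searchable store so that repeated questions need not be routed again.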

5.54 Computer and communication networks

The general process of digitization has resulted in the emergence of a new feature: the convergence of telecommunications and computer networks. Detailed information about computers and digitizing tools is covered in chapter three. Here, efforts have been made to explain the new methods through which digital information can reach a large body of users.

According to the World Communication Report [30], the following types of networks would be most useful for providing digital information to a larger community.

Cable networks

Cable TV uses a wired medium (coaxial cable or optic fibre) in order to transmit several programmes simultaneously from the operator of a specialized TV station to subscriber households. With a few exceptions, the geographical coverage of cable TV channels in practical terms is confined to urban areas, owing to constraints in the field and criteria of economic efficiency. Cable TV operators today are fully aware of the new capabilities of satellite retransmission which, through data compression, have boosted the range of programmes on offer and brought new interactive services into being. To meet the competition from satellites, cable TV operators use various transmission resources and data compression to enhance their capacity on coaxial cable networks. They also rely on new optic fibre architectures and on progress in Asymmetrical Digital Subscriber Loop technology (ADSL). This is a digital transmission technique provided by telecommunications operators to broadcast pay-per-view video over their copper wire telephone network, hence the name Video Dial Tone (VDT). The process makes it possible to transmit up to four television signals at the same time as a telephone conversation. In other words, the subscriber can talk on the telephone and watch a video programme transmitted at the same time over the same telephone line. The advantage of this type of technology is the ability to offer pay-per-view video type services over the telephone network without any major investment, as long as the home of the subscriber and the telephone switchboard are both equipped with transceivers.

The change which has had the greatest impact on the current radiobroadcasting environment is without doubt the emergence of optic fibre (light modulation) cable as a potential means of transmitting communication signals. Thanks to optic fibre, telephone companies and cable TV operators can provide broadband video services which would be impossible with conventional metal wiring networks, or, at any rate, possible only at greater cost and with less efficiency.

Digital terrestrial networks

Terrestrial broadcasting is still the main means of telecasting, enabling more than a billion television viewers worldwide to receive their programmes, despite the development of other means such as cable or satellite, which offer greater capacity in terms of the number of channels available. Conventional technology has made a comeback through the new possibilities provided by digital data compression combined with the microwave distribution technology known as MMDS (Multichannel Multipoint Distribution Service). This process leads to improved picture and sound and, above all, makes it possible to broadcast four times the number of programmes received through the conventional analog technique, as well as enhancing spectrum management. In concrete terms, television programmes are picked up via a dish antenna which receives a multitude of programmes. A specific MMDS technology transmitter with a range of 5 to 10 kilometres broadcasts the pictures. In surrounding villages or rural areas, a small receiver antenna can distribute the signals to between ten and fifteen households. Broadcasting each digital microwave stream, however, will require the installation of a special network of transmitters and relay links. The investment may be competitive compared with the cost of a satellite repeater, which is generally rented at around $600,000 per year. The technique is currently developing at high speed in regions such as Africa, the Middle East and the Russian Federation, as well as Central and Eastern Europe, where homes are too widely scattered for the installation of cable networks to be economically viable. There are a number of digital terrestrial broadcasting projects currently under way. The British Government was the first to propose the introduction of digital terrestrial broadcasting in the United Kingdom as early as 1998.

[Figure: television distribution chain showing satellite uplink and downlinks, public broadcasting station, transmitter, cabled network head, cabled network and satellite broadcasting]

Mobile networks

Mobile communications are booming today, owing to their ability to 'free' the end-user completely from the constraints of a wire connection. Mobile systems form the terminal links in a communication network, and provide the multi-purpose and integrational functionalities required by recent trends in technological convergence. A mobile telephone equipped with a modem and connected to a portable computer can thus be converted into a genuine mobile office, since it can receive and send faxes and files, and access e-mail services from anywhere in the world. It is in the cellular telephone market, however, that growth in mobile communications has been greatest. In the developed countries, which are well equipped with stationary communication systems, the cellular telephone matches the diversity of professional and personal uses and requirements. In the developing countries, it represents an advantageous solution, capable of making up for the shortcomings of the conventional telecommunications infrastructure, particularly in underprivileged areas or those with a low population density. In general, the mobile communications market has one of the highest expansion ratios in the telecommunications sector, with an annual growth rate close to 20 per cent. In all, the number of subscribers worldwide should rise from 40 million in 1995 to 100 million by 1998. Mobile phones are thus switching from being upmarket to mass market. That development is likely to be to the detriment of wired networks, which may be abandoned by a vast group of end-users in favour of a telephone number that will follow them wherever they go.

5.6 Advantages of Digital Libraries:

Digital libraries are less expensive because they do not need the physical space required by conventional libraries. All information can be stored on CD, hard disk or other digital media in a small building.

Capability to access all information from a remote place. If the system is set up well, computers all over the world can communicate with each other and access digital libraries. There is no need for a physical building for users to visit. Instead of the user traveling to another place to get information, the information can travel to the user.

Digital storage of documents allows efficient and user-friendly displays of information. The information can be arranged in any fashion the user desires. Graphics and multimedia techniques can also be used effectively in user interfaces to display information.

Round the clock access to knowledge through personal computers.

Access to the full text of books and articles, not just surrogates, with keyword searching.

Equal treatment of different forms of material: digital books, journals, maps, etc.

Potential for establishing links between works in different forms and related works.

5.7 Limitations of Digital Libraries

a) A computer and Internet access are essential to use a digital library.
b) Reading from a screen cannot be sustained for long periods.
c) Printing is costly (the running cost of a printer is high).
d) Infrastructure is costly (the startup cost is high).
e) Personnel are not trained for use and maintenance.
f) Definitions are not self-explanatory.
g) There are many scripts; conversion to digital form is a problem.
h) Some questions cannot be answered in natural language.
i) The cost of converting old material is high.
j) Conversion of complex documents is a problem.
k) Images are extremely difficult to classify and index.
l) There are problems with copyright.
m) Information is volatile.

5.8 Digital Libraries Initiative (DLI-1) Phase 1

The initiative's focus is to dramatically advance the means to collect, store and organize information in digital form and make it available for searching, retrieval and processing via communication networks in user-friendly ways. Digital libraries store materials in electronic format and manipulate large collections of those materials effectively. Research into digital libraries is research into network information systems, concentrating on how to develop the infrastructure necessary to manipulate en masse the information on the Net. The key technological issues are how to search and display desired selections from and across large collections.

There are a number of places where appropriate research, development and forward thinking are being done. The Digital Libraries Initiative is a funding exercise of various agencies of the U.S. government to promote research into the technologies and implementation strategies underlying future digital libraries. Phase 1 was funded by the:

• National Science Foundation (NSF)

• Defense Advanced Research Projects Agency (DARPA)

• National Aeronautics and Space Administration (NASA)

In the first phase six digital library projects were funded, each led by a major university. The aim of these projects was to dramatically advance the means to collect, store and organize information in digital form and make it available for searching, retrieval and processing via communication networks in user-friendly ways. A common strategy in all of these projects is to emphasize research partnerships. The initiative will both capitalize on advancements made to date and promote research to further develop the tools and technologies needed to make vast amounts of useful information available to large numbers of people with diverse information needs.

5.81 University of California, Berkeley.

The environmental electronic library [31] (http://elib.cs.berkeley.edu/) project was part of the NSF, DARPA and NASA Digital Libraries Initiative and part of the California Environmental Resources Evaluation System. Research at Berkeley includes faculty, staff and students in the Computer Science Division, the School of Information Management and Systems, and the Research Program in Environmental Planning and Geographic Information Systems, as well as participation from government agencies and industrial partners. The project's goal was to develop the technologies for intelligent access to massive, distributed collections of photographs, satellite images, maps, full-text documents and 'multivalent' documents.

5.82 University of California, Santa Barbara.

The Alexandria Project (Maps) (http://alexandria.sdc.ucsb.edu/index.hml) [32], a map digitization project, was a consortium of researchers, developers and educators spanning the academic, public and private sectors and exploring a variety of problems related to a distributed digital library for geographically referenced information. Distributed means that the library's components may be spread across the Internet as well as co-existing on a single desktop. Geographically referenced means that all the objects in the library are associated with one or more regions (footprints) on the surface of the Earth. The centerpiece of the Alexandria Project is the Alexandria Digital Library (ADL), an online information system inspired by the Map and Imagery Laboratory (MIL) in the Davidson Library at the University of California, Santa Barbara. The ADL currently provides access over the World Wide Web to a subset of the MIL's holdings as well as other geographic datasets.
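As a rough illustration of what "geographically referenced" can mean in practice, the following minimal sketch gives each library object a rectangular footprint and returns the objects whose footprint overlaps a query region. The class names, the bounding-box simplification (which ignores regions crossing the 180° meridian) and the sample holdings are all assumptions made for illustration; they are not taken from the ADL implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """A rectangular region in degrees (hypothetical simplification)."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        # Two rectangles overlap when they intersect on both axes.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class LibraryObject:
    """A catalogued item (map, image, dataset) tied to a footprint."""
    title: str
    footprint: Footprint


def search_by_region(holdings, region):
    """Return titles of all objects whose footprint overlaps the region."""
    return [obj.title for obj in holdings if obj.footprint.overlaps(region)]


# Invented sample holdings, for illustration only.
HOLDINGS = [
    LibraryObject("Santa Barbara coastal map", Footprint(-120.0, 34.3, -119.5, 34.6)),
    LibraryObject("Lake Tahoe aerial photograph", Footprint(-120.2, 38.9, -119.9, 39.3)),
]

if __name__ == "__main__":
    query = Footprint(-120.5, 34.0, -119.0, 35.0)
    print(search_by_region(HOLDINGS, query))
```

A real system would need richer footprint shapes and a spatial index; the linear scan here is only meant to show the idea of retrieval by region rather than by title or subject.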

5.83 Carnegie Mellon University. Informedia (Video) [33] (http://www.informedia.cs.cmu.edu)

The Informedia Digital Video Library is a research initiative at Carnegie Mellon University, funded by the NSF, DARPA, NASA and others, that studies how multimedia digital libraries can be established and used. Informedia is building a multimedia library that will consist of over one thousand hours of digital video, audio, images, texts and other related material. The Informedia digital video library is populated by automatically encoding, segmenting and indexing data. Research in the areas of speech recognition, image understanding and natural language processing supports the automatic preparation of diverse media for full-content and knowledge-based search and retrieval. The Informedia project has pioneered new approaches for automated video and audio indexing, navigation, visualization, search and retrieval, and has embedded them in systems for use in education, information and entertainment environments. Informedia is one of the six Digital Libraries Initiative projects.

5.84 University of Illinois, Urbana-Champaign [34] (http://dli.grainger.uic.edu/idli/idli.htm)

This project was based at the new Grainger Engineering Library Information Center at the University of Illinois at Urbana-Champaign and was centered on journals and magazines in the engineering and science literature. The testbed includes a customized version of NCSA Mosaic, software developed at the National Center for Supercomputing Applications under NSF and DARPA sponsorship to help users navigate the World Wide Web. This testbed has become a production facility of the university library, with thousands of documents and tens of thousands of users across the University of Illinois and other Big Ten universities. Research based in the Graduate School of Library and Information Science encompasses sociological evaluation of the testbed, technological development of semantic retrieval and prototype design of future scalable information systems.

5.85 University of Michigan. Intelligent Agents for Information Location [35] (http://www.si.umich.edu/UMDL)

Much digital library work has begun from the centralized, structured view of a library and sought to provide access to the library through digital means. In the view of the University of Michigan Digital Library Project (UMDL), this approach loses the advantages of decentralization (geographic and administrative), rapid evolution and flexibility that are hallmarks of the web. The UMDL seeks to keep those advantages while embracing the traditional values of service, organization and access that have made libraries powerful intellectual institutions.

The challenge is to provide an infrastructure that lets patrons (and publishers) feel that they are working within a library, with the traditional emphasis on providing service and organized content, when in fact the underlying space of goods and services is volatile, administratively decentralized and constantly evolving. Moreover, the decentralized and flexible infrastructure can be exploited to allow information goods and services to evolve in a much more rapid, diverse and opportunistic way than was ever possible in traditional libraries, for the good of consumers and providers.

In the UMDL, they are meeting these challenges by defining and incrementally developing interfaces and infrastructures for users and providers such that intellectual work (finding, creating and disseminating knowledge) is embedded in a persistent, structured context even though the underlying networked system is evolving. The core of the UMDL has been the agent architecture, which supports the teaming of agents to provide complex services.

5.86 Stanford University. Infobus (http://www.diglib.stanford.edu)

The Stanford Digital Libraries Project [36] is one participant in the four-year, $24 million Digital Libraries Initiative, started in 1994 and supported by the NSF, DARPA and NASA. In addition to its ties with the five other universities that are part of the initiative, Stanford also has a large number of partners. Each university project takes a different angle on the total project, with Stanford focusing on interoperability. The collection of this digital library is primarily computer literature. However, the project also has a strong focus on networked information sources, meaning that the vast array of topics found on the World Wide Web is accessible through it as well. At the heart of the project is a testbed running the "Infobus" protocol, which provides a uniform way to access a variety of services and information sources through "proxies" acting as interpreters between the Infobus protocol and the native protocol.
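The proxy idea described above can be sketched, under stated assumptions, as a small adapter pattern: each proxy presents the same `search` call to client applications and translates it into the call style of one native service. The two services, their method names and the sample data below are hypothetical and invented purely for illustration; this is not the Infobus protocol itself.

```python
from abc import ABC, abstractmethod


class Proxy(ABC):
    """Uniform interface that every proxy exposes to client applications."""

    @abstractmethod
    def search(self, query):
        """Return a list of result strings for a plain-text query."""


class CatalogService:
    """Hypothetical native service: a title catalog with its own call style."""

    def __init__(self, titles):
        self._titles = titles

    def find_titles(self, keyword):
        return [t for t in self._titles if keyword.lower() in t.lower()]


class WebIndexService:
    """Hypothetical native service: a term-to-URL index with a different call style."""

    def __init__(self, index):
        self._index = index

    def lookup(self, terms):
        return {term: self._index.get(term, []) for term in terms}


class CatalogProxy(Proxy):
    """Translates the uniform search call into the catalog's native call."""

    def __init__(self, service):
        self._service = service

    def search(self, query):
        return self._service.find_titles(query)


class WebIndexProxy(Proxy):
    """Translates the uniform search call into the index's native call."""

    def __init__(self, service):
        self._service = service

    def search(self, query):
        terms = query.lower().split()
        hits = self._service.lookup(terms)
        return [url for term in terms for url in hits[term]]


def search_all(proxies, query):
    """Client code sees only the uniform interface, never the native ones."""
    results = []
    for proxy in proxies:
        results.extend(proxy.search(query))
    return results


if __name__ == "__main__":
    catalog = CatalogProxy(CatalogService(["Digital Libraries", "Database Systems"]))
    web = WebIndexProxy(WebIndexService({"digital": ["http://example.org/dl"]}))
    print(search_all([catalog, web], "digital"))
```

The design choice this illustrates is that adding a new information source requires writing only one new proxy; existing client applications are unchanged.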

With the Infobus protocol running under the hood, a variety of user-level applications provide powerful ways to find information, using cutting-edge user interfaces for direct manipulation or agent technology. A second area of focus for the Stanford Digital Libraries Project is the legal and economic issues of a networked environment.

5.9 Digital Libraries Initiative (DLI-2) Phase 2

Digital Libraries Initiative Phase 2 (DLI-2),[37] compared with the first set of projects which began in 1994, is a larger and broader effort. It received around three times as many proposals (230 requesting over $400M), and they went to a management group of more than twice as many government agencies. The 24 funded projects cover a substantially wider range of subjects and media, and the program involves about twice as much money in total as the DLI-1 round of projects five years ago. The increase in activity, sponsorship, and breadth reflects the success of the field and, in particular, the success of the DLI-1 projects and the public attention and interest they achieved with their results. We can only regret that funding limits prevent still larger and more ambitious projects. Most important administratively is the expansion of the group of government agencies sponsoring the program. DLI-2 is an effort of the:

• National Science Foundation (NSF)

• Defense Advanced Research Projects Agency (DARPA)

• National Library of Medicine (NLM)

• Library of Congress (LOC)

• National Endowment for the Humanities (NEH)

• National Aeronautics & Space Administration (NASA)

• Federal Bureau of Investigation (FBI)

In partnership with the:

• Institute of Museum and Library Services (IMLS)

• Smithsonian Institution (SI)

• National Archives and Records Administration (NARA)

The new agencies joined the program as a result of seeing the DLI-1 results, and their participation has permitted widening the efforts in digital libraries, particularly into the medical and humanities disciplines. This is a clear instance of positive feedback operating: good research results attracted more supporting agencies and more financing.

DLI-2 has projects addressing new kinds of media: sound recordings of the human voice at Michigan State University, music at Johns Hopkins University, political and economic data at Harvard University, and a combination of software and data at the University of South Carolina. These join with continued study of video materials at Carnegie Mellon University, images at several places including the University of California, Santa Barbara and Stanford University, and textual materials as parts of nearly all projects. Several projects, including those at the University of California, Berkeley and Tufts University, combine several kinds of media.

The new projects also deal with content in new subject areas: anthropological models and images at the University of Texas, literary manuscripts at the University of Kentucky, patient care at Columbia University, and folk literature at the University of California, Davis. These projects also involve new technology. For example, the Tufts University project not only extends the digital library effort into the domain of classical studies but will also look at ways to involve mapping and imaging information together with text, and the University of Kentucky is looking at new ways of digitizing literary manuscripts as well as new ways of using them.

Of course, there are also new technological areas being explored, such as interoperability and security questions at [name missing in source] and Stanford University, automatic classification at the University of Arizona, information filtering at the University of Indiana, and a new and particularly interesting area, data provenance, at the University of Pennsylvania. Again, many projects extending subject areas or media are also expanding the technological reach, as at Columbia University, where new summarization methods are being created in the medical area of patient care information.

Needless to say, there is a great deal of other work on digital libraries in the United States and around the world. The Library of Congress, the Digital Library Federation and various private foundations such as the Mellon Foundation support very important work in the digital library field. The combined efforts of a great many universities with internally funded digital library work are much larger than any of the centrally organized programs. Many of these efforts are coming together now, most notably in the state of California, where the "Interlib" name refers to the combined efforts of the federally funded research projects and the state-created California Digital Library. Some of these other efforts fill in the gaps left in the DLI-2 awards, most particularly in the area of economic experimentation to help us understand what the long-term organizational and financial basis of digital library services will be.

Perhaps the most significant impact of the Federal agency digital library effort is not the specific projects today, nor the spin-offs from previous work (the Lycos and Google search engines, for example, trace their ancestry to DLI-1 awards), but the researchers involved. Some senior scholars in other disciplines, who could easily continue their careers in the areas in which they have been working, are changing to digital library research. This happened with Hector Garcia-Molina and Robert Wilensky in the earlier set of awards, and now we see senior professors such as Sidney Verba and Gio Wiederhold joining DLI-2 projects. Attracting researchers into a field is more important than choosing the subject areas of the research. Research is inherently unpredictable, but with people such as our new awardees working in the field, we can be confident that the outcomes will be significant and beneficial.

Digital Libraries Initiative Phase 2 is a multi-agency initiative which seeks to provide leadership in research fundamental to the development of the next generation of digital libraries, to advance the use and usability of globally distributed, networked information resources, and to encourage existing and new communities to focus on innovative application areas.

Since digital libraries can serve as intellectual infrastructure, this initiative looks to stimulate the partnering arrangements necessary to create next-generation operational systems in such areas as education, engineering and design, earth and space sciences, biosciences, geography, economics, and the arts and humanities. It will address the digital library life cycle from information creation, access and use to archiving and preservation.

An important part of this initiative is research to gain a better understanding of the long-term social, behavioral and economic implications and effects of new digital library capabilities in such areas of human activity as research, education, commerce, defense, health services and recreation.

Detailed information on these projects, including their home pages, is given in Appendix B.

5.10 Digital Library examples

Columbia University Avery Library:

One of the earliest imaging projects undertaken by a library was Project AVIADOR (Avery Videodisc Index of Architectural Drawings on RLIN) [38], launched by the Avery Library of Columbia University in 1985. The primary goal of the project was to provide machine-readable cataloguing information on 41,000 architectural drawings and incorporate an image of each drawing on an electronic storage device. The intent was to allow users to flip through whole collections or go directly to a specific drawing, all without handling the originals. Not only would access to Columbia's historic collections of architectural drawings be significantly improved, but the project would also help to preserve them.

The drawings in the initial project represent 40 prominent archival collections at Avery, including the works of such well-known architects as Frank Lloyd Wright (c. 546 items), those of Wright's mentor, Louis Sullivan (c. 175 items), those of the popular renderer, Hugh Ferriss (c. 310 items) and a sample of McKim, Mead & White (c. 560 items). The smallest collection catalogued during the project is that of James Renwick, Jr. (58 items) and the 8,248 drawings from the collection represent the evolution of his firm over many years.

AVIADOR was launched when imaging technology was still new. Rather than relying on relatively new digital scanners and optical storage devices, the decision was made to photograph the materials and store them on analogue videodiscs, with the intent to create digital images from the high-quality photographs or negatives at a later date. AVIADOR was designed for use with a personal computer connected to the Research Libraries Information Network (RLIN) mainframe. RLIN devised a program for the interface that connected each of the 41,000 still-frame images of drawings on the videodisc to the catalog record on RLIN.

The user could:

a) Search for the appropriate bibliographic record(s) in RLIN using a variety of textual indexes (such as architect, title, building name, geographic location and so on).
b) When a desired record was identified and displayed on the IBM PC monitor, place the cursor at the accession number(s) for a particular drawing within the RLIN record.
c) Then call up the corresponding visual image(s) on the videodisc monitor by pressing the appropriate function key on the IBM PC keyboard.

Alternatively the user could:

a) Begin with the graphic index by scanning the images on the videodisc.
b) When a desired image had been identified and displayed on the video monitor, retrieve the corresponding bibliographic record by pressing the appropriate function key on the IBM PC keyboard.

Fifty copies of the videodisc were produced at $500 each, including the necessary software and a user manual. The image files were subsequently converted to digital format and stored on magnetic disk drives.

Ford Motor Technical Information Center Library:

One of the first imaging files linked to an automated library system was that of the Technical Information Center (TIC) Library [39], the major scientific and engineering library of the Ford Motor Co. In 1989 it sent out an RFP for a document-imaging system that would provide users with desktop electronic access to the full text of company technical reports and would also be integrated with the Library's online catalog. The library was already using COMSTOW Information Services' BiblioTech online integrated library software for all library functions.

The TIC staff had long recognized a need within Ford Motor Co. for a central depository of company reports for many users. The library had maintained a collection of Research and Product & Manufacturing Engineering Staff reports, but there was no central location for reports generated by other research or engineering groups. As a result, valuable information was not shared as widely as it should have been. The TIC staff saw optical storage and imaging technology as a means of providing the entire organization with access to company technical information without increasing the Library's need for physical space or adding another place to look for information.

The BiblioTech system was mounted on a Digital Equipment Corp. (DEC) MicroVAX 3900 with 32 MB of RAM and more than 2 GB of magnetic storage. The successful bidder proposed capturing the images using digital scanners and storing them on CD-ROM. The library accepted the proposal and subsequently implemented an interface between the BiblioTech system and PC-based CD-ROM devices containing the digital images.

National Agricultural Library Text Digitizing Program [40]:

A microcomputer-based scanning system was installed at the National Agricultural Library (NAL) in January 1988. More than 4,000 pages of non-copyrighted aquaculture publications were scanned and digitized to create both bit-mapped page images and ASCII text. The text was indexed using TextWire Plus from Unibase, and the resulting database, "Aquaculture I", was distributed on CD-ROM to the participating Land Grant libraries in March 1989. The libraries' role was to evaluate the delivery medium, the retrieval software and the contents of the disk itself. In addition, a comparison of retrieval from ASCII text alone versus ASCII text with page images was conducted.

A second disk, "Food, Agriculture and Science", was produced by the Consultative Group on International Agricultural Research (CGIAR) and sent in September 1989 to both the Land Grant sites and CGIAR sites around the world for evaluation. It contained CGIAR materials nominated by the CGIAR sites themselves, used Kaware2 retrieval software from Knowledge Access, and included both full text and graphics. The project was so successful that the CGIAR commissioned the production of an entire series of CD-ROMs.

In May 1990, a third database was distributed. The fourth and final evaluation disk contained more than 4,100 selected pages from NAL's large special collection on Agent Orange, produced by the National Agricultural Text Digitizing Program (NATDP) and entitled simply "Agent Orange". The next disk, released in late 1992, was "Aquaculture II". It contained 6,500 page images. After that came the "George Washington Carver" CD-ROM, containing 3,500 page images from three reels of microfilm of Carver's papers, letters and illustrations. The microfilm is part of a 67-reel collection produced by Tuskegee University. This was one of the library's first major projects to scan from microfilm.

The NAL subsequently installed two VTLS InfoStations to facilitate the linking of its image databases with its VTLS automated library system. Each InfoStation consisted of a NeXT workstation with 2.4 GB of hard disk; images were selectively downloaded from the CD-ROM onto the NeXT hard disk storage. Users search the bibliographic database on the VTLS system by author, title, subject or other access point at the InfoStation, determine that an image file exists for a bibliographic record, and retrieve the image onto the InfoStation screen by clicking the "Retrieve Multimedia" button at the bottom of the window. Users can return to the bibliographic record screen when finished with the image file. The interface is seamless: users need not log off one system to access the other, nor do they lose their place. However, there is one drawback: one cannot tell whether an image exists for a bibliographic record by consulting the initial brief bibliographic record display; one must call up the full bibliographic record.

Peabody College Education Library (Vanderbilt University):

The Clipper project [41] of the Peabody College Education Library was a Mac-based optical storage system to allow easy retrieval of active reference files by end-users. Clipper used Macintosh computers both as file servers and as workstations. MARS software from Micro Dynamics Ltd., through an interface customized for each of its clients, managed the storage and retrieval of the documents. The installation, made in 1988, was one of a few turnkey systems installed in a library by a major multi-industry imaging vendor. The project was funded in part by a hardware grant from the Apple Academic Development Donation Program.

The images, around 12,000 of them, were scanned from material in 10 vertical filing cabinets: newspaper clippings, newsletter articles, pamphlets, working papers, reports and so on. Much of the information was current and local, from Tennessee, Nashville or the southern states. It also included a great deal of statistical information in the form of charts, graphs and tables. The major components of Clipper were:

• Input workstation - a Macintosh IIx with an Apple scanner. Documents were scanned, their images temporarily stored on the hard disk, and descriptive information about each document entered at the workstation.

• Optical server - a Macintosh II with a Pioneer 5.25-inch double WORM drive attached. The optical server managed the storage and retrieval of scanned images that were archived from the input workstation's hard drive to WORM disks.

• Directory server - a dedicated Macintosh SE/30 that stored descriptive information about the scanned images stored on the WORM disks. It acted as an index to the stored images.

• Retrieval workstation - a Macintosh IIx with a 19-inch Sigma Design monitor and a LaserWriter SC for printing retrieved images. The workstation was set up in the reference room.

• MARS (Multi-user Archival and Retrieval System) - software that allowed the network of Macintoshes to store, find and retrieve document images.

• An Ethernet LAN that connected the Clipper system to the campus network, making Clipper available in offices and rooms outside the library.

AT&T:

Since 1989, the AT&T Information Services Network [42] has been scanning internal technical memoranda, initially at 400 dpi (dots per inch), and storing the images on write-once-read-many (WORM) optical disks. Tens of thousands of documents were scanned, and new documents continue to be scanned. Initially, requests for copies were filled by printing the document on a 400 dpi printer at a central site and sending it out via company mail. Although this system was a large improvement over its predecessor, in which a clerk located the original document in a filing cabinet and made a xerographic copy, there was still a delay of several days before the requester received the copy.

When the AT&T network bandwidth was increased to 456 Mbps, all constraints on the movement of images from the image server to desktop workstations were removed. Images are now retrievable via the company network.
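To give a sense of the data volumes involved, the following back-of-the-envelope sketch estimates the uncompressed size of one scanned page and its transfer time over the network. The 400 dpi resolution and the 456 Mbps link speed are the figures quoted above; the letter-size page (8.5 by 11 inches), the 1 bit per pixel (black-and-white) depth and the absence of compression or protocol overhead are assumptions made only for illustration.

```python
def page_image_bits(width_in, height_in, dpi, bits_per_pixel=1):
    """Uncompressed size in bits of a page scanned at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel


def transfer_seconds(bits, link_mbps):
    """Ideal transfer time, ignoring protocol overhead and contention."""
    return bits / (link_mbps * 1_000_000)


if __name__ == "__main__":
    # Assumed page: 8.5 x 11 inches, bitonal, scanned at 400 dpi.
    bits = page_image_bits(8.5, 11, 400)
    print(bits // 8, "bytes uncompressed per page")
    print(round(transfer_seconds(bits, 456), 3), "seconds per page at 456 Mbps")
```

Under these assumptions a page is a little under 2 MB uncompressed, which helps explain why slower networks made desktop delivery of page images impractical.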

Boulder Public Library:

Boulder Public Library (BPL) [43] implemented photo image access in its online catalog in 1992 as an enhancement to its CARL system public catalog. Funded in part by the Boulder Public Library Foundation, the photo image access project offered a new type of access to visual materials. Designed originally to be used with the historical photograph collection of the BPL, the process was subsequently applied to other visual collections as well.

The process linked photo images to corresponding MARC records in BPL's bibliographic database, and allowed the public access catalog (PAC) user to display the images on demand on a PC.

Project staff at BPL selected photos to be scanned from the collection and determined the bibliographic identification (BID) number for the corresponding MARC record via CARL Systems' Bibliographic Maintenance. After determining the scanning resolution according to a chart developed for the project by CARL, the staff member scanned the photo, stored it in its own unique file, and saved the file, named with the appropriate BID. The image was stored in compressed format, at 640 by 480 pixels with a minimum of 32 gray values. The process supported both color and black-and-white photos, although color was slower to scan and required greater storage capacity.
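The storage cost of one such image before compression can be estimated from the figures above. In this sketch the 640 by 480 size and the 32 gray values come from the text; the function name and the choice to show both a tightly packed layout and a one-byte-per-pixel layout are assumptions, since the text does not say how the pixels were stored or what compression ratio was achieved.

```python
import math


def raw_image_bytes(width, height, gray_levels, packed=True):
    """Uncompressed size in bytes of a grayscale image.

    packed=True uses the minimum number of bits per pixel;
    packed=False rounds each pixel up to a whole number of bytes.
    """
    bits = max(1, (gray_levels - 1).bit_length())
    if not packed:
        bits = 8 * math.ceil(bits / 8)
    return math.ceil(width * height * bits / 8)


print(raw_image_bytes(640, 480, 32))                # 5 bits per pixel
print(raw_image_bytes(640, 480, 32, packed=False))  # 8 bits per pixel
```

Thirty-two gray values need 5 bits per pixel, so a raw image is roughly 190 to 310 KB depending on the layout, which helps explain why the images were kept in compressed form.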

To look at the photo image, the user searched the PAC as usual. Image display required a PC workstation with CARL system emulation software and a VGA monitor. When a bibliographic record with a linked image was retrieved, the online message "imaged" appeared as a part of the short display. The user could then select the full bibliographic record and, by responding to an additional prompt, display the image on the PAC screen.
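The linking scheme described above, in which each image file is named with the BID of its MARC record and the short display flags records that have an image, can be sketched as follows. This is only an illustration of the idea and not CARL's software: the directory layout, the `.img` extension, the record dictionary, the sample titles and the function names are all hypothetical.

```python
import tempfile
from pathlib import Path


def image_path(image_dir, bid):
    """Path an image for this BID would have (hypothetical naming rule)."""
    return Path(image_dir) / f"{bid}.img"


def short_display(record, image_dir):
    """Build a short display line, flagging records that have a linked image."""
    line = f"{record['bid']}  {record['title']}"
    if image_path(image_dir, record["bid"]).exists():
        line += "  [imaged]"
    return line


# Made-up records: only the first one has an image file on disk.
with tempfile.TemporaryDirectory() as image_dir:
    image_path(image_dir, "12345").write_bytes(b"")
    print(short_display({"bid": "12345", "title": "Street scene"}, image_dir))
    print(short_display({"bid": "67890", "title": "Mountain view"}, image_dir))
```

Because the file name carries the BID, no separate lookup table is needed in this sketch: the catalog can test for an image simply by checking whether a file with that name exists.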

BPL stored the photographic images locally on magnetic disks and made them available to PAC terminals connected to BPL's local area network.

The scanning process used a Microtek 300z scanning device and Picture Publisher software. The scanning software ran in the Microsoft Windows environment. The scanning station required a 33 MHz 80386 PC-compatible machine with 8 MB of memory and 200 MB of disk, and a mouse.

CARL systems and the Boulder Public Library Foundation jointly marketed the imaging process and linking software to CARL systems libraries and other interested parties.

Library of Congress: Almost everyone is familiar with the Digital Library Program of the Library of Congress (LC) [44] and its American Historical Collections component. Of the millions of books, photographs, prints, drawings, manuscripts, rare books, maps, sound recordings, and moving pictures held by the Library, only a small fraction are in digital form. American Memory, a major component of the Library's digitization program, offers multimedia collections of digitized documents, photographs, recorded sound, moving pictures, and text from the Library's Americana collections. Through a grant from the Ameritech Foundation, the LC/Ameritech Digital Library Competition enables public, research and academic libraries, museums, historical societies and archival institutions to create digital collections of primary resource material to complement the Library's program. The Library also cooperates internationally to collect digitized laws, regulations, and other complementary legal sources in the GLIN project.

The goal of LC preservation reformatting is to preserve the Library's collections and offer broad public access to at-risk materials. Digital reformatting is considered one among many options for crafting an integrated preservation strategy for collection materials, developed in partnership with Library curators and recommending officers, custodial divisions, and other Preservation Directorate staff.

The digitizing component of the preservation-reformatting program has three parts:

• Selection criteria

• Digital reformatting principles and specifications, including phased delivery, and

• Life-cycle management of LC digital data

Meeting of Frontiers is a bilingual, multimedia English-Russian digital library that tells the story of the American exploration and settlement of the West, the parallel exploration and settlement of Siberia and the Russian Far East, and the meeting of the Russian-American frontier in Alaska and the Pacific Northwest.

It is intended for use in U.S. and Russian schools and libraries and by the general public in both countries. Scholars, particularly those who do not have ready access to major research libraries, will also benefit from the mass of primary material included in Meeting of Frontiers, much of which has never been published or is extremely rare.

The project grew out of discussions in 1997-98 between members of Congress, in particular Senator Ted Stevens of Alaska, and James H. Billington, the Librarian of Congress. The collapse of communism and the breakup of the Soviet Union created new opportunities for American educators and scholars to interact directly with their counterparts in Russia, as well as new demands in the United States for information about Russia. Nowhere was the new situation more apparent than in Alaska, where the end of the Cold War led to a revival of ethnic, religious, and economic ties going back to the Russian settlement of Alaska in the late eighteenth century.

The Meeting of Frontiers site was unveiled in December 1999. It included more than 2,500 items, comprising some 70,000 images, from the rare book, manuscript, photograph, map, film, and sound recording collections of the Library of Congress. Expansions of the site took place in September 2000, January 2001, May 2001, and December 2001, adding many thousands of items and accompanying explanatory text.

In November and December 1999 the Library of Congress concluded agreements with the Russian State Library (Moscow) and the National Library of Russia (St. Petersburg) regarding their participation in the project. In May 2000, joint Library of Congress-Russian teams completed the installation of high-resolution scanning equipment, on long-term loan from the Library of Congress, at both institutions. The Library of Congress also concluded a cooperative agreement with the Rasmuson Library, University of Alaska Fairbanks. The first digital images from the Alaskan and Russian partner institutions (rare maps, photograph albums, and sheet music) were added to the site in January 2001.

5.11 Indian Digital Library Initiatives:

Vidyanidhi (Indian Digital Library of Electronic Theses)

Vidyanidhi [45] is the Indian Digital Library of Electronic Theses project undertaken by the Department of Library and Information Science, University of Mysore. It aims to establish a large online resource of Indian theses and to develop mechanisms for accessing their full content via desktop computers and networks. The project is sponsored by the National Information System for Science & Technology (NISSAT), DSIR, Govt. of India.

Specifically, Vidyanidhi proposes to collect, catalogue and archive Indian theses and make them accessible worldwide through the Internet. It seeks to demonstrate the feasibility, mechanisms and methods of electronic theses and dissertations (ETDs) in the Indian context, considering the diversity of content, formats, subjects, languages and scripts. The project is a part of NDLTD (Networked Digital Library of Theses and Dissertations), a global ETD initiative of Virginia Tech University, Virginia, USA.

"Down Memory Lane", National Library of India, Calcutta (Digitization of rare and brittle documents on compact disks) [46]

The National Library of India holds a very rich collection of old and rare documents. Approximately 1.5 million documents need to be conserved by some method. In the first phase the library will scan 10,000 volumes of rare, brittle and fragile documents, covering 25 lakh pages. After proper checking, these scanned pages will be archived on CDs with the necessary modification and indexing. The project was taken up to archive rare and brittle books published before 1920. At present it covers old English-language and Bengali-language documents and old government publications (East India Company, settlement reports).

The following hardware and software are used to scan the documents.

a) A Pentium server with 64 MB RAM, 500 MHz, 4 GB hard disk.

b) Twelve nodes are attached to this server, and a peer-to-peer (workgroup) network is used for this job.

c) Five scanners are used for this job: three Hewlett Packard (HP) scanners (HP ScanJet 5100 and HP ScanJet 6100c), one high-speed Fujitsu scanner and a Contex A0-size scanner.

d) HP CD-Writer Plus 7200 series.

e) Windows 98 is the operating system, and the 'Datascan' scanning software, developed by Stex Pvt. Ltd., is used for this project.

f) The National Library uses Gold CDs to store the data. Datascan supports storage on CD, DVD, hard disk and floppies.

g) For this job, 12 staff have been provided by Stex Software Pvt. Ltd. and five professionals by the National Library, Kolkata. Approximately 1000-1200 pages are scanned, cleaned and checked by the unit per day.
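The daily throughput quoted in item (g) can be set against the first-phase target of 25 lakh pages. The sketch below is simple arithmetic on those two figures from the text; the function name is illustrative, and a steady rate with no downtime is assumed.

```python
import math

LAKH = 100_000  # one lakh is 100,000


def days_needed(total_pages, pages_per_day):
    """Whole working days needed to scan total_pages at a steady daily rate."""
    return math.ceil(total_pages / pages_per_day)


total = 25 * LAKH
for rate in (1000, 1200):
    print(f"{rate} pages/day: {days_needed(total, rate)} working days")
```

At the quoted rates the first phase alone corresponds to roughly 2,100 to 2,500 working days for a single unit, which gives a sense of the scale of the undertaking.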

The Datascan software offers user-friendly retrieval techniques:

• Information can be generated in all 12 fields assigned to this project

• Full text search facility

• Boolean search with AND, OR and NOT.
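Boolean retrieval of the kind listed above is commonly built on an inverted index that maps each word to the set of documents containing it. The following is a minimal sketch of that general technique, not a description of how Datascan works internally; the sample records, the function names and the parameter names are made up for illustration.

```python
import re


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, all_ids, must=(), any_of=(), must_not=()):
    """Boolean retrieval: AND over must, OR over any_of, NOT over must_not."""
    result = set(all_ids)
    for term in must:                      # AND: keep ids containing every term
        result &= index.get(term.lower(), set())
    if any_of:                             # OR: keep ids containing any term
        hits = set()
        for term in any_of:
            hits |= index.get(term.lower(), set())
        result &= hits
    for term in must_not:                  # NOT: drop ids containing the term
        result -= index.get(term.lower(), set())
    return result


# Made-up sample records, for illustration only.
docs = {
    1: "Settlement report of a Bengal district",
    2: "East India Company revenue report",
    3: "Bengali grammar for beginners",
}
index = build_index(docs)
print(sorted(search(index, docs, must=["report"], must_not=["company"])))
print(sorted(search(index, docs, any_of=["bengal", "bengali"])))
```

Each operator reduces to a set operation (intersection, union, difference), which is why Boolean search is cheap to offer once the index exists.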

Million Book Project (Carnegie Mellon University)

The Million Book Project (MBP) is a very prominent project of Carnegie Mellon University. When completed, this project will produce approximately 250 million pages, or 500 billion characters, of information. The storage requirements for the image files will be approximately 50 petabytes, an order of magnitude larger than any publicly available information base. Creating and managing such a vast information base poses many technological challenges and provides a fertile test bed for innovative research in many areas (described below). The MBP is a multi-agency, multi-national effort that will require the database to be globally distributed. For location-independent access, this globally distributed database should appear to be a virtual central database from any place around the world. Mirroring the database in several countries will ensure security and availability. The network speeds at the various nodes will differ. Research in distributed caching and active networks will be needed to ensure that the look and feel of the database is the same from any location.

The search engines work on the principle of keyword matching and perform searches in one language at a time. With a large corpus of multilingual data provided by the MBP, along with multilingual summarization and translation tools, a well-directed research effort would be needed to ensure concept- and content-based retrieval of knowledge from across multilingual data.

The accuracy of Optical Character Recognition (OCR), even in some of the most developed languages, is hindered by the bad quality of the images. This is particularly true for older books and those that use ancient fonts for which the OCR is not tuned. Even the very best OCR accuracy, of the order of 98%, may not be acceptable in some cases. In order to obtain an improved accuracy close to 100%, advanced image processing research that performs recognition beyond the character level will be needed. With the availability of large test data from the MBP and the exponentially increasing computing power of microprocessors, well-directed image processing research would lead to near-perfect optical recognizers.
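The practical meaning of a 98% accuracy figure becomes clearer when it is applied to the collection sizes quoted earlier for the MBP. In this sketch the 98% figure, the 500 billion characters and the 250 million pages come from the text; the 99.9% comparison value and the function name are assumptions, and errors are assumed to be spread evenly over pages.

```python
def expected_errors(characters, accuracy_percent):
    """Expected number of misrecognized characters at a given OCR accuracy."""
    return characters * (100 - accuracy_percent) / 100


total_chars = 500_000_000_000   # from the text
total_pages = 250_000_000       # from the text
chars_per_page = total_chars // total_pages

for accuracy in (98, 99.9):
    per_page = expected_errors(chars_per_page, accuracy)
    print(f"{accuracy}% accurate: about {per_page:.0f} errors per page")
```

With about 2,000 characters on an average page, 98% accuracy still leaves some 40 wrong characters per page, or on the order of 10 billion across the whole collection, which is why the text argues for recognition beyond the character level.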

In the new digital economy, providing democratic access to information while suitably and reasonably rewarding the innovator is possible. The largest repertoire of free software available on the web has in many cases been the outcome of state-supported research. This free availability of software has in fact contributed to more developments and hence an exponential growth of knowledge. Even in literary and scholarly publications, authors have experienced increased sales of their work whenever it is made freely available on the web. This is in tune with observations in the new economy that companies that make more and more of their software freely available on the web have their market capitalization enhanced. The MBP, with its proposed plans to make a large knowledge base freely available, will provide useful statistics for testing many economic and sociological models. The MBP has identified the following Indian universities and organizations for digitization of rare and unique material:

• Indian Institute of Science, Bangalore

• International Institute of Information Technology

• Indian Institute of Information Technology

• Anna University, Chennai

• Mysore University, Mysore

• University of Pune, Pune

• Goa University, Goa

• Tirumala Tirupati Devasthanams, Tirupathi

• Shanmugha Arts, Science, Technology & Research Academy, Tanjore

• Arulmigu Kalasalingam College of Engineering, Srivilliputhur

• Maharashtra Industrial Development Corporation, Mumbai

By March 2003, 9771 books had been digitized by these centers, as shown in Figure 5.4.

[Chart: books scanned at each center. Legend: IISc, AKCE, SASTRA, TTD, MIDC, PUNE, AU, KANCHI, CCL, SCL]

Figure 5.4 Progress of scanning in various centers

The researcher acquired the latest information about digitization efforts carried out in Indian universities. Though there is much discussion on these topics, practical work is negligible. The detailed analysis is provided in the next chapter. Efforts have been made to discuss the available digital library models, which would be helpful in implementing digitization activities at the university level.

References:

1. Miksa, Francis and Doty, Philip. Intellectual realities and the digital library. Available from: http://www.uts.cs.utexas/miksa.html (Accessed on 11 November 1999)
2. Berkeley Digital Library. Available from: http://sunsite.berkeley.edu/ (Accessed on 11 November 1999)
3. Rowlands, Ian and Bawden, David. "Digital libraries: a conceptual framework." Libri 49 (1999): 192-202
4. Digital Library Federation. A working definition of digital library. 1999. Available from: http://www.clir.org/diglib/dldefinition.htm (Accessed on 27 November 2000)
5. High Performance Computing and Communications Information Technology Subcommittee, IITA Task Group. Information infrastructure technology and applications: Report of the IITA Task Group. High Performance Computing and Communications Information Technology Subcommittee, 1997. Available from: http://www.ccic.gov/pubs/iita/2.3.html (Accessed on 27 November 2000)
6. Lesk, Michael. Practical digital libraries: books, bytes and bucks. San Francisco, Morgan Kaufmann, 1997. p. 3
7. Witten, Ian H. and Bainbridge, David. How to build a digital library. New York, Morgan Kaufmann, 2003. p. 16
8. Lesk, Michael. Op. cit.
9. Association of Research Libraries. North American Digital Library System: definition and purposes of a digital library. 1995. Available from: http://www.libnet.sh.cn/digiib/definition.htm (Accessed on 22 June 2001)
10. Borgman, Christine L. "What are digital libraries, who is building them, and why?" In Digital libraries: Interdisciplinary concepts, challenges and opportunities, edited by T. Aparac. 1999. p. 29
11. Lynch, Clifford and Garcia-Molina, Hector. Interoperability, scaling, and the digital libraries research agenda: A report on the May 18-19, 1995 IITA Digital Libraries Workshop. 1995. Available from: http://diglib.stanford.edu/diglib/pub/reports/iita-dlw/main.html#2 (Accessed on 14 August 1999)
12. Cleveland, Gary. Digital libraries: Definitions, issues and challenges. Available from: http://www.ifla.org/VI/5/op/udtop8/udtop8.htm (Accessed on 13 September 1999)
13. Seamans, Nan and McMilan, Gail. Digital library definition for DLI2. 1998. Available from: http://scholar.lib.vt.edu/DLI2/defineDL.html (Accessed on 13 September 1999)
14. Barber, David. "Building a digital library: Concepts and issues". Library Technology Reports. 32 (1996): 573-617.
15. British Library, Digital Library Programme Team. The British Library Digital Library Programme: Towards the digital library. 1999. Available from: http://www.bl.uk/services/ric/diglib/digilib.html (Accessed on 13 September 1999)
16. Colorado Digitization Project. Digital toolbox glossary. 1999. Available from: http://coloradodigital.coalliance.org/glossary.html (Accessed on 20 September 1999)
17. Committee on Computing, Information, and Communications, National Science and Technology Council. High performance computing and communications: Advancing the frontiers of information technology. 1997. Available from: http://www.ccic.gov/pubs/blue97/acc-diglib.html (Accessed on 20 September 1999)
18. Fox, Edward A., et al. "Digital libraries". Communications of the ACM. 4 (1995): 24-45
19. High Performance Computing and Communications Information Technology Subcommittee. Op. cit.
20. Leiner, Barry M. The scope of the digital library: Draft prepared for the D-Lib Working Group on Digital Library Metrics. Available from: http://www.dlib.org/metrics/public/papers/dig-lib-scope.html (Accessed on 20 September 1999)
21. Library of Congress. Introduction [to the Library of Congress/Ameritech 1998/1999 guidelines: National Digital Library competition]. 1998. Available from: http://lcweb2.loc.gov/ammem/award/guide98.html (Accessed on 20 September 1999)
22. Marchionini, Gary. "Research and developments in digital libraries". In Encyclopedia of library and information science, v. 63, edited by Allen Kent. New York: Marcel Dekker, 1998. pp. 259-279.
23. Maxymuth, John. "Internet: Counting by ones and zeros". The Bottom Line. 12 (1999): 41-44.
24. National Library of Canada. Inventory of Canadian digital initiatives: Scope. 1999. Available from: http://www.nlc-bnc.ca/initiatives/ecriteria.htm (Accessed on 25 September 1999)
25. Saffady, William. "Digital library concepts and technologies for the management of library collections: an analysis of methods and costs". Library Technology Reports. 31 (1995): 223-224.
26. Sorkin, Virginia D. and Farley, Judith. "National digital library". In Encyclopedia of library and information science, v. 62, edited by Allen Kent. New York: Marcel Dekker, 1998. pp. 216-228
27. Rajashekhar, T.B. Digital Library and Information Services in Enterprises (IS214). Available from: http://144.16.72.189/raja/ (Accessed on 12 June 2003)
28. Vijaykumar, J.K. and Jeevan, J.K. "Digital library developments: major issues of externally published contents." In Conference proceeding of Caliber 2001. Dept. of Lib. & Inf. Sc., University of Pune, 2001. pp. 174-184.
29. Library of Congress. Collaborative Digital Reference Service (CDRS). Available from: http://www.sls.lib.il.us/reference/por/features/2001/cdrs.html (Accessed on 29 June 2002)
30. UNESCO. World communication report. Paris, Unesco, 1999. Available from: http://www.unesco.org/webworld/wirerpt/vers-web.htm (Accessed on 24 August 2002)
31. University of California, Berkeley. The environmental electronic library. Available from: http://elib.cs.berkely.edu/ (Accessed on 12 January 2001)
32. University of California, Santa Barbara. The Alexandria Project (Maps). Available from: http://alexandria.sdc.ucsb.edu/index.hml (Accessed on 12 January 2001)
33. Carnegie Mellon University. Informedia (Video). Available from: http://www.informedia.cs.cmu.edu (Accessed on 12 January 2001)
34. University of Illinois, Urbana Champaign. Available from: http://dli.grainger.uic.edu/idli/idli.htm (Accessed on 12 January 2001)
35. University of Michigan. Intelligent Agents for Information Location. Available from: http://www.si.umich.edu/UMDL (Accessed on 12 January 2001)
36. Stanford University. Infobus. Available from: http://www.diglib.stanford.edu (Accessed on 12 January 2001)
37. Digital Libraries Initiatives phase-2. Available from: http://www.dli2.nsf.gov/ (Accessed on 22 June 2000)
38. Library technology report. "Digital library case studies." (2001): 49-55. Available from: http://www.techsource.ala.org (Accessed on 24 November 2001)
39. ibid.
40. ibid.
41. ibid.
42. ibid.
43. ibid.
44. ibid.
45. University of Mysore. Vidyanidhi: Indian digital library of electronic theses (pamphlet). Mysore, Dept. of Lib. & Inf. Sc.
46. Mujumdar, Uma. "Down memory lane: a project of the national library for digitization of rare and brittle documents on compact disks". In Conference proceeding of IASLIC: Content management in India in digital environment. Kolkata, IASLIC, 2001. pp. 61-66
47. Balkrishnan, N. Universal library status. Handout.
