Strategies for Preservation of Web-Based Content: Intellectual Property Barriers to Building the Sudan Web Archive

STRATEGIES FOR PRESERVATION OF WEB-BASED CONTENT: INTELLECTUAL PROPERTY BARRIERS TO BUILDING THE SUDAN WEB ARCHIVE A thesis submitted to the Graduate College at the University of Khartoum for obtaining the Degree of PhD in Information & Library Science BY Gafar Ali Fadol Ibrahim BA (Hons), (Information & Library Science), (1998) University of Khartoum MA, (Information & Library Science), (2002) University of Khartoum Supervisor Professor Radia Adam Mohamed Department of Information & Library Science, Faculty of Arts April, 2010 i © Copyright by Gafar Ali Fadol Ibrahim 2010 All Rights Reserved ii Acknowledgements The supervisor, Professor Radia Adam Mohammed, lecturer at the Department of Information and Library Science at the University of Khartoum, has provided invaluable advice and direction on every aspect of this study, as well as constant moral support. No Ph.D. candidate could have asked for more. She also made many useful suggestions and recommendations. Her keen intuition and innovative instructions add value and privilege to the thesis. Furthermore, I would like to express my appreciation and gratitude to the internal examiner, Professor Sami Sherif, Dean of Faculty of Engineering and Architecture (formerly Director of Information Technology and Network Administration) at the University of Khartoum for providing precious insights and cheerful welcoming, and to Dr Hamid El-Dood Mahdi Abd El-Rahman, lecturer of English Literature at the University of Al- Neelain for his valuable advices and for having reviewed the language of the text and Dr Ahmed Tawer, who reviewed the grammar of the Arabic abstract. My gratitude also extends to Dr Sabri El-Haj, Head of Information and Library Science Department at the University of Khartoum, Ms Wafaa Mohamed Osman, Director of Library and Documentation Centre at CWTOA, Dr Lennart Bjoneborn, lecturer at the Royal School of Information and Library Science in Denmark, for allowing quote from their dissertation and thesis and the external examiner Professor Ali Mohamed Shummu, Director of the Press and Printed Press Materials National Council, for their kind efforts. However, my colleagues at GHD Global, especially Debbie Mace (Administrator), Michael Kemp (Manager) and Kate Rabbits (Marketing Manager), who have been endless supportive in arranging for me to have the time to work on this study and providing me with access to office, laboratories, digital library and iii database facilities. Thanks to Mr Ibrahim Al Amin, acting-Director of the Sudan National Library, Ms Najuwa Mahmoud, Head of the Digitisation Project at the Sudan National Record Office and to all librarians and archivists who responded to the survey from worldwide online libraries and archives. Finally, without my family’s assistance and patience, it would not have been possible to complete this document. Their attention and endless endeavour to create calm study environment entitle them to be regarded as full contributors to its production. iv STRATEGIES FOR PRESERVATION OF WEB-BASED CONTENT: INTELLECTUAL PROPERTY BARRIERS TO BUILDING THE SUDAN WEB ARCHIVE A thesis submitted to the Graduate College at the University of Khartoum for obtaining the Degree of PhD in Information & Library Science By Gafar Ali Fadol Ibrahim Abstract Advancement in Web publishing creates new challenges to management of Web-based content as part of the national library holdings. Such content is important but ephemeral and prone to disappearance if no strategies are made to preserve it. Restrictions of intellectual property rights (IPRs) constitute main barriers to its preservation. This study aims to advocate a balance between the information seeking needs and the protection of IPRs for establishment of the Sudan Web Archive. The national Web space at ‘.sd’ domain is chosen as a setting for the empirical investigation. The survey and action research methods are employed in order to sample, identify and characterise statistical data that is collected, compiled, analysed and collated using SPSS 16.0 software. The survey method is reinforced by two instruments questionnaires and interviews. The action research method is used to find out a solution to the problems of the study, while the Ranganathan’s laws are revisited in the selection and collection development strategies. This study is placed within the context of Webometrics field as a new core discipline of information and library science. Results indicate that Alpha index measured by Cronbach is at (0.8) for all provided factors. A correlation is found v significant among pair of factors at (0.05). Results of three months of study show that 25% of the Web-based content is ephemeral and findings reached from similar previous researches show that annual rate of content disappearance is at 44% and the average life expectancy of a Website is only 44 days. It is found that the period from the death of the ‘.sd’ domain in mid-2001 and its resurrection in 18th November 2002 a great deal of national Web- based content probably disappeared. These findings indicate that the Web-based content mortality rates are high and the shorter lifecycle of a Website urges the Sudan National Library to adopt establishment of a Web archive, as proposed in this study in order to save the country’s Web space for future access. It is recommended to form a Web legal deposit law to authorise copying for building the collection of the proposed Web Archive. Keywords: National Library - Sudan; the Web – archiving and preservation; the Web - Intellectual property; the Web - Legal deposit – Sudan; Webometrics, science of. vi إﺳﱰاﺗﻴﺠﻴﺎت ﺣﻔﻆ ﳏﺘﻮى اﻟﻮﻳﺐ: ﺣﻮاﺟﺰ اﳌﻠﻜﻴﺔ اﻟﻔﻜﺮﻳﺔ اﻟﱵ ﺗﻮاﺟﻪ ﺑﻨﺎء ذاآﺮةاﻟﺴﻮدان ﻟﻠﻮﻳﺐ رﺳﺎﻟﺔ ﻣﻘﺪﻣﺔ ﻟﻨﻴﻞ درﺟﺔ دآﺘﻮراة اﻟﻔﻠﺴﻔﺔ ﰲ ﻋﻠﻢ اﳌﻌﻠﻮﻣﺎت واﳌﻜﺘﺒﺎت إﻋﺪاد: ﺟﻌﻔﺮ ﻋﻠﻲ ﻓﻀﻞ إﺑﺮاهﻴﻢ اﳌﺴﺘﺨﻠﺺ أدت اﻟﺘﻄﻮرات اﳊﺪﻳﺜﺔ ﰲ ﳎﺎل اﻟﻨﺸﺮ ﻋﻠﻰ اﻟﻮﻳـﺐ إﱃ ﻇُﻬـﻮر ﲢﺪﻳﺎتٍ ﺟﺪﻳﺪةٍ ﺗُﻮاﺟﻪ إدارة ﻣﺼﺎدر اﳌﻜﺘﺒﺔ اﻟﻮﻃﻨﻴﺔ آﺎﶈﺘﻮى اﳌﻨﺸﻮر ﻋﻠﻰ اﻟﻮﻳﺐ ﺑﺈ ﻋﺘﺒﺎرﻩ ﺟﺰءاً ﻻ ﻳﺘﺠﺰأ ﻣﻦ ﳎﻤﻮﻋﺎﲥﺎ . ﻓﻬﺬا اﶈُﺘﻮى ﻳﻌﺪ ﻣﻦ أآﺜﺮ اﳌﻮاد هﺸﺎﺷﺔً وﻋﺮﺿﺔً ﻟﻠﻀﻴﺎع إذا ﱂ ﻳﺘﻢ إﻋﺪاد إﺳﱰاﺗﻴﺠﻴﺎت ﳊﻔﻈـﻪ . وﺗُـﺸَﻜِﻞ ﺣـﻮاﺟﺰُ اﳌﻠﻜﻴـﺔ اﻟﻔﻜﺮﻳﺔ ﻋﻘﺒﺔً رﺋﻴﺴﺔً ﳊﻔﻆ هﺬا اﻟﻨﻮع ﻣﻦ اﶈﺘﻮى . ﻳﻬﺪف هﺬا اﻟﺒﺤﺚ إﱃ إﺟﺮاء ﻣﻮازﻧﺔ ﺑﲔ إﺣﺘﻴﺎﺟﺎت اﺳـﱰﺟﺎ ع اﳌﻌﻠﻮﻣـﺎت وﺑﲔ ﲪﺎﻳﺔ ﺣﻘﻮق اﳌﻠﻜﻴﺔ اﻟﻔﻜﺮﻳـﺔ ﺣـﱴ ﳝﻜـﻦ ﺑﻨـﺎء ذاآـﺮة اﻟﺴﻮدان ﻟﻠﻮﻳﺐ . وﰎ اﺧﺘﻴﺎر ﻧﻄـﺎق اﻟـﺴﻮدان ﻋﻠـﻰ اﻟﻮﻳـﺐ آﻨﻘﻄﺔ اﻹرﺗﻜﺎز واﻟﺘﻘﺼﻲ اﻟﻌﻤﻠﻲ . واﺳﺘﺨﺪم ﻣﻨﻬﺠﲔ ﳘﺎ ﻣﻨﻬﺞ اﻟﺒﺤﺚ اﳌﺴﺤﻲ وﻣﻨﻬﺞ اﻟﺒﺤﺚ اﻟﺘﻄﺒﻴﻘﻲ ﺑﻐـﺮض ﲨـﻊ، ﻣﻌﺎﳉـﺔ، ﺟﺪوﻟﺔ، وﲢﻠﻴﻞ اﻟﻌﻴﻨﺔ وﲢﺪﻳﺪ اﻟﺒﻴﺎﻧ ﺎت اﻹﺣﺼﺎﺋﻴﺔ ووﺻـﻔﻬﺎ ﺑﺎﺳــﺘﺨﺪام ﻧﻈــﺎم ﺣــﺰم اﻟﺘﺤﻠﻴــﻞ اﻹﺣــﺼﺎﺋﻲ ﻟﻠﺪراﺳــﺎت اﻻﺟﺘﻤﺎﻋﻴــﺔ. وﻟﻠﻤﻨــﻬﺞ اﳌــﺴﺤﻲ أداﺗــﺎن ﳘــﺎ اﳌﻘﺎﺑﻠــﺔ واﻹﺳــﺘﺒﻴﺎن. ووﻇــﻒ اﳌﻨــﻬﺞ اﻟﺘﻄﺒﻴﻘــﻲ ﻹﳚــﺎد ﺣــﻞ ﳌــﺸﻜﻠﺔ اﻟﺪراﺳﺔ آﻤﺎ أﻋﻴﺪ ﺗﻮﻇﻴﻒ ﻗﻮاﻧﲔ راﳒﺎﻧﺎﺛﺎن ﰲ إﺳﱰاﺗﻴﺠﻴﺎت اﺧﺘﻴﺎر وﺗﻨﻤﻴﺔ اﻤﻮﻋﺎت . وﺗﻨﺪرج هـﺬﻩ اﻟﺪ راﺳـﺔ ﰲ ﺳـﻴﺎق ﳎﺎل اﻟﻮﻳﺒﻮﻣﻴﱰﻳﻜﺎ آﺘﺨﺼﺺ ﺟﺪﻳﺪ ﺿﻤﻦ ﲣﺼﺼﺎت ﻋﻠﻢ اﳌﻌﻠﻮﻣـﺎت واﳌﻜﺘﺒــﺎت. وﺗــﺸﲑ اﻟﻨﺘــﺎﺋﺞ إﱃ أنً ﻣﺆﺷــﺮ ﻣﻌﺎﻣــﻞ أﻟﻔــﺎ ﻟﻼﻋﺘﻤﺎدﻳﺔ ﺑﻠﻎ (0.8) ﳉﻤﻠﺔ اﻟﻌﻨﺎﺻﺮ اﻟﱵ أُﺧﻀﻌﺖ ﻟﻠﺘﺤﻠﻴـﻞ اﻹﺣﺼﺎﺋﻲ. وأنً اﻹرﺗﺒﺎط ﺑﲔ ﺗﻠﻚ اﳌﺘﻐﲑات ﻟﻪ دﻻﻟﺔ إﺣﺼﺎﺋﻴﺔ ﺑﻠﻐﺖ (0.05) وهﻲ ﺗﻮﺿﺢ ﺗﺄﺛﺮ ﻋﻤﻠﻴﺎت اﳊﻔﻆ ﲝﻮاﺟﺰ اﳌﻠﻜﻴـﺔ اﻟﻔﻜﺮﻳﺔ. وﺗﺸﲑ اﻟﻨﺘﺎﺋﺞ إﱃ أنَ ﻧﺴﺒﺔ إﺧﺘﻔﺎء ﳏﺘﻮى اﻟﻮﻳـﺐ ﻗﺪ ﺑﻠﻐﺖ 25% ﰲ ﺛﻼﺛﺔ أﺷﻬﺮ وﺗﺸﲑ ﻧﺘﺎﺋﺞ دراﺳﺎت ﳑﺎﺛﻠـﺔ أنً ﻧﺴﺒﺔ إﺧﺘﻔﺎء ذﻟﻚ اﶈﺘﻮى ﻗﺪ ﺑﻠﻐﺖ 44% ﰲ اﻟﺴﻨﺔ وأنً ﻣﺘﻮﺳﻂ دورة ﺣﻴﺎة ﻣﻮﻗﻊ اﻟﻮﻳﺐ هﻮ 44 ﻳﻮﻣﺎً ﻓﻘـﻂ . ووﺟـﺪ أنً ﻓـﱰة ﺗﻮﻗﻒ ﺧ ﺪﻣﺎت ﻧﻄﺎق اﻟﺴﻮدان ﻋﻠﻰ اﻟﻮﻳﺐ ﻋﻦ اﻟﻌﻤﻞ ﰲ ﻣﻨﺘـﺼﻒ ﻋﺎم 2001م وﻋﻮدﺗـﻪ إﱃ اﳋﺪﻣـﺔ ﰲ ﺎﻳـﺔ ﻋـﺎم 2002م رﲟـﺎ ﺗﺴﺒﺒﺖ ﰲ اﺧﺘﻔﺎء ﻧـﺴﺒﺔ ﻋﺎﻟﻴـﺔ ﻣـﻦ ذﻟـﻚ اﻟـﱰاث اﻟﻘـﻮﻣﻲ اﻟﺮﻗﻤﻲ. إنَ إرﺗﻔﺎع ﻧﺴﺒﺔ إﺧﺘﻔﺎء هﺬا اﻟﻨـﻮع ﻣـﻦ اﶈﺘـﻮى vii وﻗﺼﺮ دورة ﺣﻴﺎﺗﻪ ﺗﺪﻋﻮ ﻣﺘﺨﺬي اﻟﻘﺮار ﺑﺎﳌﻜﺘﺒـﺔ اﻟﻮﻃﻨﻴـﺔ إﱃ ﺑﻨ ﺎء ذاآﺮة اﻟﺴﻮدان ﻟﻠﻮﻳـﺐ، آﻤـﺎ هـﻮ ﻣﻘـﱰح ﰲ هـﺬﻩ اﻟﺪراﺳﺔ، ﳊﻔﻆ ﳏﺘﻮى اﻟﻨﻄـﺎق ﻷﻏـﺮاض اﻟﻨﻔـﺎذ اﳌـﺴﺘﻘﺒﻠﻲ . وﺗﻮﺻﻲ اﻟﺪراﺳﺔ ﺑﺈﻋﺪاد ﻗﺎﻧﻮن ﳊـﻖ اﻹﻳـﺪاع ﳜـﺘﺺ ﺑﺎﻟﻮﻳـﺐ ﻟﻠﺴﻤﺎح ﺑﺘﻨﻤﻴﺔ ﳎﻤﻮﻋﺎت اﻟﺬاآﺮة. آﻠﻤﺎت ﻣﻔﺘﺎﺣﻴﺔ: اﳌﻜﺘﺒـﺔ اﻟﻮﻃﻨﻴـﺔ - اﻟـﺴﻮدان؛ اﻟﻮﻳـﺐ، أرﺷﻔﺔ وﺣﻔﻆ؛ ذاآﺮة اﻟﻮﻳـﺐ _ اﻟـﺴﻮدان؛ ﺻـﻨﺎﻋﺔ اﶈﺘـﻮى؛ اﻟﻮﻳﺐ، ﺣﻘﻮق اﳌﻠﻜﻴﺔ اﻟﻔﻜﺮﻳﺔ؛ اﻟﻮﻳﺒﻮﻣﱰي، ﻋﻠﻢ. viii List of Abbreviations and Acronyms Abbr./ Full form Acro. AARC Anglo-American Cataloguing Rules ADCIA Australian Digital Content Industry Agenda ADI American Documentation Institute AIDS/HIV Acquired Immune Deficiency Syndrome/Human immunodeficiency Virus ARIPO African Regional Intellectual Property Organisation ARPANET Advanced Research Projects Agency Network ASCII American Standard Code for Information Interchange ASIS American Society for Information Science BIRPI Bureaux Internationaux reunis pour la protection de la propriete intellectuelle BNB British National Bibliography BnF Bibliotheque nationale de France CBD Convention on Biological Diversity ccTLDs country code Top-Level Domains CCELP Committee on Coordination of the Electronic Library Project CENL Committee of the Conference of European National Libraries CERN European Organisation for Nuclear Research ix COBRA Computerised Bibliographic Record Actions CRO Central Records Office CSDL Conseil scientifique du dépôt legal CSIC Centro de Informacion y Documentacion Cientifica CWTOA Commission on the World Trade Affairs DAISY Digital Accessible Information System DARPA Defence Advanced Research Projects Agency DCMI Dublin Core Metadata Initiative DCNRC Documentation Centre of the National Research Centre DDA Doha Development Agenda DDC Dewey Decimal Classification DDS Document delivery services DIU Dams Implementations Unit DMCA Digital Millennium Copyright Act DNS Domain name systems DOD Department of Defence DSEP Deposit systems for electronic publications DTB Digital talking books DVD Digital versatile disc EAD Encoded Archival Description eIFL Electronic Information for Libraries EUCEI European Union Court of First Instance FID Institut International de Documentation FIFA Federation Internationale

Strategies for Preservation of Web-Based Content: Intellectual Property Barriers to Building the Sudan Web Archive

Selection in Web Archives: the Value of Archival Best Practices

Australia (PDF: 202KB)

Annual Report to Partners 2019-2020

Country Report [Period Coverage: 1 January 2018 – 31 December 2018]

The Web As History

Annual Report 2019-2020

Reflections on Collection Policies in NSLA Libraries

Is Web Archives a Misnomer – How Web Archives Can Become Digital Archives? in C

Annual Report to Partners 2018-19

The Web As History. Using Web Archives to Understand the Past and the Present 2017