Introduction to Digitization

Introduction to Digitization: An Overview
July 16th 2008, FIS 2308H
Andrea Kosavic
Digital Initiatives Librarian, York University

Introduction to Digitization

  • Digitization in context
  • Why digitize?
  • Digitization challenges
  • Digitization of images
  • Digitization of audio
  • Digitization of moving images
  • Metadata
  • The Inuit through Moravian Eyes

Digitization in Context

http://www.jisc.ac.uk/media/documents/programmes/preservation/moving_images_and_sound_archiving_study1.pdf

Why Digitize?

  • Obsolescence of source devices (for audio and moving images)
  • Content unlocked from a fragile storage and delivery format
      • More convenient to deliver
      • More easily accessible to users
      • Do not depend on source device for access
  • Media has a limited life span
  • Digitization limits the use and handling of originals
  • Digitized items more easy to handle and manipulate
  • Digital content can be copied without loss
  • Analog formats degrade with each use and lose quality when copied
  • Can be delivered to a far reaching audience over internet
  • Can add metadata, e.g. MPEG7 allows enhanced searching

Digitization challenges

  • Multiple formats to choose from
  • Can’t match quality to that of the source
  • Analog version is still considered the preservation master copy
  • Expensive
      • Digitization equipment
      • Storage
      • Staff time
  • Storage…we’re talking TBs!
      • CD quality audio is 520 MB per hour
      • DVD-quality video = 13 GB per hour
      • Broadcast quality video = 75 GB per hour (ITU-R BT.601)
  • Technical limitations
      • Compression algorithms still evolving
      • High bandwidth required for transfer
      • For an audio file recorded at preservation standards, it takes 5x the duration of the file to transfer over T1 network

Digitization of Images

  • Introduction to various materials
  • The Digitization Process
  • Common Image Formats

Image: http://www.wpclipart.com/computer/hardware/scanner.png

Multiple format types

  • Maps
  • Photographs
  • Plans
  • Negatives
  • Manuscripts
  • Microfilm
  • Plain Text
  • Transparencies
  • Drawings
  • Slides
  • Paintings
  • Charts & graphs

Flatbed Scanner

  • Good for smaller plans / maps, photographs, plain text
  • Auto Sheet Feeder attachments allow for fast digitization of single sheets
  • Scans a variety of resolutions: 200 dpi – 9600+ dpi
  • Scans at 1 bit (black and white), 8 bit (grayscale), and 24 or 48 bit (colour)

Image: http://content.answers.com/main/content/img/CDE/_CREOSCN.JPG

Flatbed Scanner Tips

  • Scan plain black and white text at 1 bit; this avoids grey background
  • Scan black and white drawings with shading at 8 bit, or 1 bit with half-toning
  • Scanning colour images with text is difficult; if scanning at 24 bit, text quality will suffer; will have to play with settings or scan separately

Digital Camera

Images: http://www.digital-photography.org/CruseGmbHdigitalscannersystem/Cruse_repro-stand_copystand.htm

Digital Camera – Book Cradle

  • Can be used with a book cradle
  • Book cradle keeps pages flat without damaging book
  • Book cradle necessary for rare manuscripts
  • Ideal for maps, plans, manuscripts, drawings, paintings

Image: http://www.i2s-bookscanner.com/visualisationMiniature.asp?image=upload/produits/gammes/acc_BC1590.gif

Specialized Scanner Types

  • Microfilm scanner
      • Specialized for microfilm
  • Slide/Negative scanner
      • Higher resolution capture
      • Come with specialized cartridges to hold different sizes of film
  • Photo scanner
      • Higher resolution capture

Images: http://www.solar-imaging.com/digital-universal.html (top), http://www.bearclover.net/epson-scanner/silverfast.html (right)

Automated Book Scanner

  • 1200 pages per hour
  • Must be supervised
  • Used by Google and Internet Archive projects for books
  • Not suitable for rare or fragile materials
  • Does not create preservation grade images (JPEGs)

Image: http://www.kirtas-tech.com/uploads/images/APTFrontPage.jpg

Targets for scanning

  • Many different sizes and types available
  • Scanned with image
  • Help to calibrate colour balance for scan
  • Use scanning software to create white and black calibration with target for each scan
  • Saved with archival digital master
  • Derivatives are usually made with the target cropped out

Images: http://www.imagequality.com/dtp/images/elec.it8.refl.jpg, http://www-rcf.usc.edu/~gainer/impa/imaging/kodak_q_60_example.jpg

Image Processing

  • De-skew
  • De-speckle
  • Reduce background
  • Rotation
  • Register

Warning

  • Only de-speckle and reduce background on images if absolutely necessary
  • Processing often results in image quality loss
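
The warning above about de-speckling can be illustrated with a minimal sketch. It assumes a 1-bit image held as a plain list of rows (1 = black, 0 = white) and a deliberately naive rule: a black pixel with no black neighbours is treated as a speck. The function name `despeckle` and the sample `page` are hypothetical, chosen for illustration; real scanning software uses more sophisticated filters.

```python
def despeckle(bitmap):
    """Return a copy of a 1-bit image with isolated black pixels removed.

    bitmap is a list of rows; each pixel is 1 (black) or 0 (white).
    A black pixel counts as a speck when none of its eight
    neighbours is black (an assumed, simplistic rule).
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    cleaned = [row[:] for row in bitmap]  # never modify the master in place
    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1:
                continue
            has_black_neighbour = any(
                bitmap[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbour:
                cleaned[y][x] = 0
    return cleaned


# A tiny 1-bit "scan": a 2x2 stroke fragment (top left), a dust speck
# (top right), and an isolated dot (bottom left) standing in for a full stop.
page = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
cleaned = despeckle(page)
```

In this sketch the filter cannot tell the dust speck from the full stop, so both disappear while the stroke fragment survives. That is the kind of quality loss the warning refers to, and one reason to run such processing on a copy rather than on the archival master.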
OCR Notes and Recommendations

  • Do not compress TIFFs; incompatible with some OCR programs
  • Adjust brightness and contrast so that text is as dark as possible and background is as light as possible (using a copy of original)
  • Skew in text will compromise OCR
  • OCR tends to be less reliable with headings
  • OCR tends to not be corrected
  • Require special ‘zoning’ algorithms for text in column format, e.g. magazines
  • Some OCR programs have a maximum pixel width of file
  • OCR will not recognize handwritten script
  • Special OCR programs are available for Gothic script, e.g. ABBYY FineReader 7

Sample Imaging Requirements

http://www.library.cornell.edu/imls/image%20deposit%20guidelines.pdf

Scanning Formats

  • Digital Master
      • TIFF format
      • Resolution of 600 dpi/ppi widely adopted for most materials
      • Lower resolutions may be used to keep file sizes down for materials such as maps
      • Bit depth depends on type of material
  • Web Delivery
      • JPEG, JPEG 2000
      • GIF only captures 256 colours

Digitization of Audio

  • Introduction to various media types
  • The Digitization Process
  • Audio Formats

Image: http://www.addclasses.com/file.php/1/1earphone5-med.jpg

Wax or Celluloid Cylinders

  • 1890s & 1900s, up to 5” diameter, 2-4 minutes playing time
  • Source device
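
The storage and bandwidth concerns raised earlier for audio come down to simple arithmetic on sample rate, bit depth, and channel count. The sketch below is only an illustration for uncompressed PCM audio. The function names are hypothetical, and the parameters are assumptions not stated in the slides: 44.1 kHz / 16-bit stereo for CD quality, 96 kHz / 24-bit stereo as one possible preservation setting, and a nominal T1 rate of 1,544,000 bits per second with no protocol overhead.

```python
def pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels):
    """Size in bytes of one hour of uncompressed PCM audio."""
    bits_per_second = sample_rate_hz * bit_depth * channels
    return bits_per_second * 3600 // 8


def transfer_ratio(sample_rate_hz, bit_depth, channels, link_bits_per_second):
    """Transfer time divided by playing time over a link of the given speed.

    A value of 3.0 means a one-hour recording takes three hours to send.
    Protocol overhead and link contention are ignored (an assumption).
    """
    audio_bits_per_second = sample_rate_hz * bit_depth * channels
    return audio_bits_per_second / link_bits_per_second


T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate, assumed here

# Assumed parameter sets; adjust to match the capture settings actually used.
cd_quality = pcm_bytes_per_hour(44_100, 16, 2)
preservation = pcm_bytes_per_hour(96_000, 24, 2)
ratio = transfer_ratio(96_000, 24, 2, T1_BITS_PER_SECOND)

print(f"CD quality:   {cd_quality / 1e6:.0f} MB per hour")
print(f"Preservation: {preservation / 1e9:.2f} GB per hour")
print(f"T1 transfer:  {ratio:.1f}x playing time")
```

The results depend heavily on the assumed parameters and on how overhead is counted, so they will not necessarily match the rounded figures quoted on the storage slide. The point is that higher sample rates and bit depths multiply both storage and transfer time.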
