Harvesting Using the Open Archives Initiative Protocol: What Can Your OAI Stream Tell You?
Sandra McIntyre, MWDL Director
Anna Neatrour, MWDL Digital Metadata Librarian
The basics
WHY OAI?
Open Archives Initiative
http://openarchives.org
“Standards for Web Content Interoperability”
• Facilitate the efficient dissemination of content contained in archives/repositories
• Low-barrier framework and standards
Why is a protocol necessary?
[Diagram: the OAI Harvester (“I want it.”) says “Give me...” to the OAI Provider (“I have it.”), which answers “Here is what you requested.”]
OAI-PMH
Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
http://www.openarchives.org/pmh/
OAI Providers
OAI Harvesters
• Mountain West Digital Library: http://mwdl.org
• OAIster: http://oaister.worldcat.org (and included in WorldCat)
• Institute of Museum & Library Services Digital Collections and Content: http://imlsdcc.grainger.uiuc.edu
• Digital Public Library of America: http://dp.la/
• ...and thousands more
Harvesting at MWDL
[Diagram: partner repositories harvested into the Mountain West Digital Library: Univ of Utah, Utah Dvsn of Arts & Museums, Univ of Nevada Las Vegas, Univ of Nevada Reno, Idaho State Archives, Utah State Library, Arizona Memory Project, Utah State Archives, Snow College, Salt Lake Comm. College, Northern Arizona Univ, Weber State Univ, Univ of Idaho, Utah State Univ, FamilySearch, Utah Valley Univ, LDS Church History, Southern Utah Univ, Montana Memory Project, BYU (Idaho), Boise State Univ.]
Why understand OAI?
• Predict what will happen with your metadata when it is harvested
• Do self-auditing and/or peer auditing of metadata: See patterns and find errors
Other metadata harvesting options
• Handing over a hard drive
• Uploading/downloading via file transfer protocol (FTP)
• Other requests of XML (typically application programming interfaces, APIs):
– Web Services
– X-Services
Advantages of OAI
• Updates at a distance, anytime
• Delivers specified records
– By collection (“set”)
– By date range of last change to record
• Uses Internet, sending packets
• Works fast
• Repeatable, standard, validatable
Testing an OAI Provider
http://re.cs.uct.ac.za/
Queries and responses
THE PROTOCOL
Queries and Responses
[Diagram: the OAI Harvester sends an OAI query to the OAI Provider, which returns an OAI response.]
Queries: BaseURL
BaseURL = OAI provider root address (does not work alone; it needs a verb)
Examples:
• http://aura.abdn.ac.uk/dspace-oai/request
• http://absronline.org/journals/index.php/index/oai
• http://cyberleninka.ru/oai
• http://digitalcommons.usu.edu/cgi/oai2.cgi
• http://azmemory.azlibrary.gov/oai/oai.php
Queries: 6 Verbs
Verb = type of request
Initial capitals; no spaces
Examples:
• Identify
• ListMetadataFormats
• ListSets
• ListIdentifiers
• ListRecords
• GetRecord
Queries: Parameters
Parameters = details about request
Format: [parameter]=[value]
Examples:
• metadataPrefix=oai_dc
• metadataPrefix=qdc
• set=awhof
• identifier=oai:content.lib.utah.edu:etd3/482
• from=1999-01-01
• until=2013-12-31
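Not every parameter goes with every verb. The sketch below records which parameters each of the six verbs takes, following the OAI-PMH 2.0 specification, and checks a request before it is sent. The names VERB_ARGUMENTS, RESUMABLE, and check_request are hypothetical helpers invented for this illustration, not part of any OAI software.

```python
# Required and optional parameters for each of the six verbs, as listed in
# the OAI-PMH 2.0 specification. VERB_ARGUMENTS, RESUMABLE, and check_request
# are hypothetical names used only for this illustration.
VERB_ARGUMENTS = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": set()},
    "ListIdentifiers": {"required": {"metadataPrefix"}, "optional": {"from", "until", "set"}},
    "ListRecords": {"required": {"metadataPrefix"}, "optional": {"from", "until", "set"}},
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
}
# Verbs whose long responses can be continued with a resumptionToken.
RESUMABLE = {"ListSets", "ListIdentifiers", "ListRecords"}


def check_request(verb, parameters):
    """Return a list of problems with a verb/parameter combination (empty if none)."""
    if verb not in VERB_ARGUMENTS:
        return [f"unknown verb: {verb}"]
    names = set(parameters)
    # A resumptionToken is an exclusive argument: it must be sent on its own.
    if "resumptionToken" in names:
        if verb not in RESUMABLE:
            return [f"{verb} does not accept resumptionToken"]
        if names != {"resumptionToken"}:
            return ["resumptionToken must be the only parameter"]
        return []
    spec = VERB_ARGUMENTS[verb]
    problems = [f"missing parameter: {name}" for name in sorted(spec["required"] - names)]
    allowed = spec["required"] | spec["optional"]
    problems += [f"unexpected parameter: {name}" for name in sorted(names - allowed)]
    return problems
```

For instance, check_request("ListRecords", {}) reports that metadataPrefix is missing, and a lowercase "listrecords" is rejected as an unknown verb because verbs use initial capitals.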
Queries: Putting it together
Syntax:
baseURL?verb=verb&parameter=value&parameter=value&parameter=value (etc.)
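The syntax above can be sketched in a few lines of Python using the standard library's urlencode. The function name build_query is a hypothetical helper for illustration; the baseURL and parameter values are taken from the earlier example slides.

```python
from urllib.parse import urlencode


def build_query(base_url, verb, parameters=None):
    """Join a baseURL, a verb, and parameter=value pairs into one OAI query URL."""
    pairs = {"verb": verb}
    pairs.update(parameters or {})
    # urlencode joins the pairs with "&" and percent-encodes reserved
    # characters (for example ":" and "/" inside an identifier).
    return base_url + "?" + urlencode(pairs)


print(build_query(
    "http://azmemory.azlibrary.gov/oai/oai.php",
    "ListRecords",
    {"metadataPrefix": "oai_dc", "from": "1999-01-01", "until": "2013-12-31"},
))
```

Passing the parameters as a dictionary avoids a clash with Python's reserved word "from", which could not be used as a keyword argument.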
For example:
http://azmemory.azlibrary.gov/oai/oai.php?verb=ListRecords&metadataPrefix=oai_dc&set=aho&from=1999-01-01
Queries you can use
EXAMPLES
Identify
“Who are you?”
http://contentdm.li.suu.edu/oai/oai.php?verb=Identify
[Diagram: the OAI Harvester sends the OAI query to the OAI Provider, which answers “I am the SUU CONTENTdm Server Repository.”]
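An Identify response comes back as XML. The sketch below reads one with Python's standard library. The element names follow the OAI-PMH 2.0 schema, but SAMPLE_IDENTIFY is an invented placeholder response (example.org values), not the actual output of the SUU server, and parse_identify is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative response only: the element names follow the OAI-PMH 2.0 schema,
# but every value is a placeholder, not a real server's output.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2013-01-01T00:00:00Z</responseDate>
  <request verb="Identify">http://example.org/oai/oai.php</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://example.org/oai/oai.php</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
    <earliestDatestamp>1999-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>"""


def parse_identify(xml_text):
    """Return the child elements of <Identify> as a dict of tag name -> text."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response has no Identify element")
    # Strip the "{namespace}" prefix ElementTree puts in front of each tag.
    # If a tag repeats (adminEmail may), the last value wins in this sketch.
    return {child.tag.split("}", 1)[-1]: (child.text or "").strip() for child in identify}


info = parse_identify(SAMPLE_IDENTIFY)
print(info["repositoryName"], info["earliestDatestamp"], info["granularity"])
```

The earliestDatestamp and granularity values are worth noting because they bound what a from or until parameter can usefully ask for.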
Identify
“I am the SUU CONTENTdm Repository.”
ListSets
“What sets do you have available?”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListSets
[Diagram: the OAI Harvester sends the OAI query to the OAI Provider, which answers “Here is the list of sets.”]
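Each set in a ListSets response carries a setSpec (the value used in the set= parameter) and a human-readable setName. A minimal sketch of reading them follows; SAMPLE_LIST_SETS is an invented response whose second set is a placeholder, and parse_sets is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative response only; the second set is a made-up placeholder.
SAMPLE_LIST_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>hist_photos</setSpec>
      <setName>Iron County Historical Photographs</setName>
    </set>
    <set>
      <setSpec>example_set</setSpec>
      <setName>Example Collection</setName>
    </set>
  </ListSets>
</OAI-PMH>"""


def parse_sets(xml_text):
    """Return a dict mapping each setSpec to its setName."""
    root = ET.fromstring(xml_text)
    sets = {}
    for node in root.iterfind("oai:ListSets/oai:set", OAI_NS):
        spec = node.findtext("oai:setSpec", default="", namespaces=OAI_NS)
        name = node.findtext("oai:setName", default="", namespaces=OAI_NS)
        sets[spec] = name
    return sets


for spec, name in parse_sets(SAMPLE_LIST_SETS).items():
    print(f"set={spec}  ({name})")
```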
ListSets
“Here’s the list of sets.”
ListMetadataFormats
“What metadata formats are available?”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListMetadataFormats
[Diagram: the OAI Harvester sends the OAI query to the OAI Provider, which answers “Here’s the list of metadata formats.”]
ListMetadataFormats
“Here’s the list of metadata formats.”
ListRecords
“Give me the metadata for all records in qualified Dublin Core.”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&metadataPrefix=oai_qdc
[Diagram: the OAI Harvester sends the OAI query to the OAI Provider, which answers “Here are the records.”]
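Each record in a ListRecords response has a header (identifier, datestamp, set) and, unless it is marked deleted, a metadata block. The sketch below pulls out those pieces. As an assumption for simplicity, the invented SAMPLE_PAGE uses unqualified Dublin Core (oai_dc) rather than the oai_qdc prefix in the query above, and all identifiers and values in it are placeholders; parse_records is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative page only: unqualified Dublin Core with placeholder values.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:hist_photos/0</identifier>
        <datestamp>2012-05-01</datestamp>
        <setSpec>hist_photos</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example photograph</dc:title>
          <dc:date>1905</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:hist_photos/1</identifier>
        <datestamp>2012-06-15</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per record: identifier, datestamp, deleted flag, titles."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        header = record.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", default="", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", default="", namespaces=NS),
            # Deleted records carry status="deleted" and no metadata block.
            "deleted": header.get("status") == "deleted",
            "titles": [t.text or "" for t in record.iterfind(".//dc:title", NS)],
        })
    return records


for rec in parse_records(SAMPLE_PAGE):
    print(rec)
```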
ListRecords
“Here are the records.”
ListRecords
• “Give me only the collection (set) of Iron County Historical Photographs.”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&metadataPrefix=oai_qdc&set=hist_photos
• “Give me the next 200 records.”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&resumptionToken=hist_photos:200:hist_photos:0000-00-00:9999-99-99:oai_qdc
GetRecord
• “Give me one specific record.”
http://contentdm.li.suu.edu/oai/oai.php?verb=GetRecord&metadataPrefix=oai_qdc&identifier=oai:contentdm.li.suu.edu:hist_photos/0
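The ListRecords and “next 200 records” requests shown earlier are usually chained: a harvester keeps asking with the resumptionToken from the previous page until the provider stops supplying one. A minimal sketch of that loop follows. The names fetch_url and harvest_identifiers are hypothetical, and the fetch argument is an assumption added so the loop can be tried against canned responses instead of a live server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url):
    """Download one OAI response as bytes."""
    with urlopen(url) as response:
        return response.read()


def harvest_identifiers(base_url, metadata_prefix, set_spec=None, fetch=fetch_url):
    """Yield every record identifier, following resumptionTokens page by page."""
    parameters = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        parameters["set"] = set_spec
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(parameters)))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier", default="")
        token = root.findtext(f"{OAI}ListRecords/{OAI}resumptionToken", default="").strip()
        if not token:
            break  # an absent or empty token marks the last page
        # A follow-up request carries only the verb and the token.
        parameters = {"verb": "ListRecords", "resumptionToken": token}
```

A call such as harvest_identifiers("http://contentdm.li.suu.edu/oai/oai.php", "oai_qdc", "hist_photos") would walk the whole set; the harvester treats the token as opaque and simply sends back whatever the provider returned.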
CONTENTdm’s OAI Provider
• Turning on OAI: Administrative interface in the “Server” tab
• Choosing which collections to share
• Sharing compound object level metadata only
Image from CONTENTdm OAI guide: http://contentdm.org/help6/server-admin/oai.asp
Record -> OAI
[Panels: Local Record with Labels; OAI]
OAI -> MWDL
[Panels: OAI; MWDL]
MWDL -> DPLA
[Panels: MWDL; DPLA]
Some Final Things to Remember
Check your own OAI stream and see what it looks like!
• Fields mapped to “none” are not in the OAI stream
• Fields with Hidden set to “yes” are not in the OAI stream
• CONTENTdm Field Properties Template and guide available at: http://mwdl.org/getinvolved/getinvolved.php
• Log in to collection admin, click on the “collections” tab, and go to fields to check and edit properties
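One way to check a stream is to count which Dublin Core elements actually arrive and list the records that lack the ones you expect, since an unmapped or hidden field simply will not appear. The sketch below does this for one ListRecords page in unqualified Dublin Core. The function name audit_fields and the default list of expected elements (title, date, rights) are assumptions for illustration, not MWDL requirements.

```python
import xml.etree.ElementTree as ET
from collections import Counter

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def audit_fields(xml_text, expected=("title", "date", "rights")):
    """Count Dublin Core elements per record and list records lacking expected ones.

    The default `expected` tuple is a hypothetical example, not a required set.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()   # element name -> number of records that contain it
    missing = {}         # record identifier -> expected elements not found
    for record in root.iter(OAI + "record"):
        header = record.find(OAI + "header")
        if header is not None and header.get("status") == "deleted":
            continue  # deleted records have no metadata to audit
        identifier = record.findtext(f"{OAI}header/{OAI}identifier", default="")
        # Only elements with non-empty text count as present.
        present = {
            el.tag[len(DC):]
            for el in record.iter()
            if el.tag.startswith(DC) and (el.text or "").strip()
        }
        counts.update(present)
        absent = [name for name in expected if name not in present]
        if absent:
            missing[identifier] = absent
    return counts, missing
```

Running this over each page of a collection gives a quick picture of patterns, for example a rights field that shows up in the local record but never in the stream.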
Field Mappings in CONTENTdm
Field Mapping example from the Western Soundscape Archive
Try it yourself!
Resources available at http://mwdl.org/getinvolved/getinvolved.php
We’re here to help!
• For additional questions about self-auditing your OAI stream, contact Anna Neatrour:
– [email protected]
– 801-587-8883
• Any questions?