Harvesting Using the Open Archives Initiative Protocol: What Can Your OAI Stream Tell You?

Sandra McIntyre, MWDL Director
Anna Neatrour, MWDL Digital Librarian

The basics
WHY OAI?

Open Archives Initiative

http://openarchives.org
“Standards for Web Content Interoperability”

• Facilitate the efficient dissemination of content contained in archives/repositories
• Low-barrier framework and standards

Why is a protocol necessary?

OAI Harvester: “I want it.” “Give me...”
OAI Provider: “I have it.” “Here is what you requested.”

OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
http://www.openarchives.org/pmh/

OAI Providers

OAI Harvesters

• Mountain West Digital Library: http://mwdl.org
• OAIster: http://oaister.worldcat.org (included in WorldCat)
• Institute of Museum & Library Services Digital Collections and Content: http://imlsdcc.grainger.uiuc.edu
• Digital Public Library of America: http://dp.la/
• ...and thousands more

Harvesting at MWDL

[Diagram: partner repositories in Utah, Nevada, Idaho, Arizona, and Montana (universities, colleges, state libraries and archives, museums, and memory projects) harvested into the Mountain West Digital Library]

Why understand OAI?

• Predict what will happen with your metadata when it is harvested
• Do self-auditing and/or peer auditing of metadata: See patterns and find errors

Other metadata harvesting options

• Handing over a hard drive
• Uploading/downloading via file transfer protocol (FTP)
• Other requests of XML (typically application programming interfaces, APIs):
  – Web Services
  – X-Services

Advantages of OAI

• Updates at a distance, anytime
• Delivers specified records
  – By collection (“set”)
  – By date range of last change to record
• Uses Internet, sending packets
• Works fast
• Repeatable, standard, validatable

Testing an OAI Provider
http://re.cs.uct.ac.za/

THE PROTOCOL
Queries and responses

Queries and Responses

OAI query: OAI Harvester -> OAI Provider
OAI response: OAI Provider -> OAI Harvester

Queries: BaseURL

BaseURL = OAI provider root address (Doesn’t work alone)

Examples:
• http://aura.abdn.ac.uk/dspace-oai/request
• http://absronline.org/journals/index.php/index/oai
• http://cyberleninka.ru/oai
• http://digitalcommons.usu.edu/cgi/oai2.cgi
• http://azmemory.azlibrary.gov/oai/oai.php

Queries: 6 Verbs

Verb = type of request
Initial capitals; no spaces

Examples:
• Identify
• ListMetadataFormats
• ListSets
• ListIdentifiers
• ListRecords
• GetRecord

Queries: Parameters

Parameters = details about request
Format: [parameter]=[value]

Examples:
• metadataPrefix=oai_dc
• metadataPrefix=qdc
• set=awhof
• identifier=oai:content.lib.utah.edu:etd3/482
• from=1999-01-01
• until=2013-12-31

Queries: Putting it together

Syntax:

baseURL?verb=[verb]&[parameter]=[value]&[parameter]=[value]& (etc.)
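The syntax above can be sketched in a few lines of Python. This is a minimal illustration, not part of any official OAI toolkit: the function name `build_oai_query` and the `from_` spelling (needed because `from` is a reserved word in Python) are assumptions made for this sketch. The baseURL is the Arizona Memory Project address from the baseURL examples.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs: initial capitals, no spaces.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_query(base_url, verb, **parameters):
    """Join a baseURL, a verb, and parameter=value pairs into one query URL.

    Hypothetical helper for illustration only.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"Unknown OAI verb: {verb!r}")
    # Callers write from_=... because 'from' is a Python keyword;
    # the trailing underscore is stripped before building the URL.
    pairs = [("verb", verb)]
    pairs += [(name.rstrip("_"), value) for name, value in parameters.items()]
    # safe=":/" leaves OAI identifiers such as oai:host:set/0 readable.
    return base_url + "?" + urlencode(pairs, safe=":/")


url = build_oai_query(
    "http://azmemory.azlibrary.gov/oai/oai.php",
    "ListRecords",
    metadataPrefix="oai_dc",
    set="aho",
    from_="1999-01-01",
)
print(url)
```

Pasting the printed URL into a browser is enough to see the provider's XML response; no harvesting software is needed for a quick check.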

For example:
http://azmemory.azlibrary.gov/oai/oai.php?verb=ListRecords&metadataPrefix=oai_dc&set=aho&from=1999-01-01

Queries you can use
EXAMPLES

Identify

“Who are you?”
http://contentdm.li.suu.edu/oai/oai.php?verb=Identify

OAI query: OAI Harvester -> OAI Provider
OAI response: OAI Provider -> OAI Harvester: “I am the SUU CONTENTdm Server Repository.”
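To make the exchange above concrete, here is a minimal sketch of reading an Identify response with Python's standard library. The sample XML is a trimmed, illustrative stand-in written for this example, not a capture from the SUU server; real responses carry more elements. The function name `read_identify` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Every OAI-PMH response element lives in this XML namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative sample only; not the server's actual output.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="Identify">http://contentdm.li.suu.edu/oai/oai.php</request>
  <Identify>
    <repositoryName>SUU CONTENTdm Server Repository</repositoryName>
    <baseURL>http://contentdm.li.suu.edu/oai/oai.php</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""


def read_identify(xml_text):
    """Pull a few self-description fields out of an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    fields = ("repositoryName", "baseURL", "protocolVersion")
    return {
        name: identify.findtext(f"oai:{name}", namespaces=OAI_NS)
        for name in fields
    }


info = read_identify(SAMPLE_IDENTIFY)
print(info["repositoryName"])
```

To try it against a live provider, the XML text could be fetched with `urllib.request.urlopen(url).read()` and passed to the same function.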

Identify

“I am the SUU CONTENTdm Repository.”

ListSets

“What sets do you have available?”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListSets

OAI query: OAI Harvester -> OAI Provider
OAI response: OAI Provider -> OAI Harvester: “Here is the list of sets.”
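A ListSets response pairs each set's short code (setSpec) with its display name (setName). The sketch below shows one way to read that list. The sample XML is illustrative: `hist_photos` is the set used elsewhere in these slides, while `example_set` is a made-up second entry, and `read_sets` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative sample only; the second set is invented for the example.
SAMPLE_LISTSETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>hist_photos</setSpec>
      <setName>Iron County Historical Photographs</setName>
    </set>
    <set>
      <setSpec>example_set</setSpec>
      <setName>Hypothetical Second Collection</setName>
    </set>
  </ListSets>
</OAI-PMH>"""


def read_sets(xml_text):
    """Return {setSpec: setName} for every set in a ListSets response."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=OAI_NS):
            s.findtext("oai:setName", namespaces=OAI_NS)
        for s in root.iterfind("oai:ListSets/oai:set", OAI_NS)
    }


for spec, name in read_sets(SAMPLE_LISTSETS).items():
    print(f"set={spec}  ({name})")
```

The setSpec values are what go into the `set=` parameter of a later ListRecords query.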

ListSets

“Here’s the list of sets.”

ListMetadataFormats

“What metadata formats are available?”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListMetadataFormats

OAI query: OAI Harvester -> OAI Provider
OAI response: OAI Provider -> OAI Harvester: “Here’s the list of metadata formats.”

ListMetadataFormats

“Here’s the list of metadata formats.”

ListRecords

“Give me the metadata for all records in qualified Dublin Core.”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&metadataPrefix=oai_qdc

OAI query: OAI Harvester -> OAI Provider
OAI response: OAI Provider -> OAI Harvester: “Here are the records.”

ListRecords

“Here are the records.”

ListRecords

• “Give me only the collection (set) of Iron County Historical Photographs.”
  http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&metadataPrefix=oai_qdc&set=hist_photos
• “Give me the next 200 records.”
  http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&resumptionToken=hist_photos:200:hist_photos:0000-00-00:9999-99-99:oai_qdc

GetRecord

• “Give me one specific record.”
  http://contentdm.li.suu.edu/oai/oai.php?verb=GetRecord&metadataPrefix=oai_qdc&identifier=oai:contentdm.li.suu.edu:hist_photos/0
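The ListRecords queries above hint at how a harvester pages through a collection: it asks once, reads the batch, and keeps asking with the resumptionToken from each response until the provider stops returning one. The following is a minimal sketch of that loop under stated assumptions, not MWDL's actual harvester. The names `harvest` and `fetch_url` are hypothetical, and the code assumes titles appear as `dc:title` elements in the Dublin Core elements namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",  # assumed namespace for titles
}


def fetch_url(url):
    """Download one OAI response as bytes."""
    with urlopen(url) as response:
        return response.read()


def harvest(base_url, metadata_prefix, set_spec=None, fetch=fetch_url):
    """Yield (identifier, titles) for every record, following resumptionTokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    while True:
        url = base_url + "?" + urlencode(params, safe=":/")
        root = ET.fromstring(fetch(url))
        for record in root.iterfind("oai:ListRecords/oai:record", NS):
            identifier = record.findtext(
                "oai:header/oai:identifier", namespaces=NS)
            titles = [t.text for t in record.iterfind(".//dc:title", NS)]
            yield identifier, titles
        token = root.findtext(
            "oai:ListRecords/oai:resumptionToken", namespaces=NS)
        if not token:  # missing or empty token: this was the last batch
            break
        # A follow-up request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

A call such as `harvest("http://contentdm.li.suu.edu/oai/oai.php", "oai_qdc", set_spec="hist_photos")` would then iterate over the whole set batch by batch. The `fetch` argument exists so the loop can be tried against saved XML files instead of a live server.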

CONTENTdm’s OAI Provider

• Turning on OAI: Administrative interface in the “Server” tab
• Choosing which collections to share
• Sharing compound object level metadata only

Image from CONTENTdm OAI guide: http://contentdm.org/help6/server-admin/oai.asp

Record -> OAI
[Side-by-side images: Local Record with Labels | OAI]

OAI -> MWDL
[Side-by-side images: OAI | MWDL]

MWDL -> DPLA
[Side-by-side images: MWDL | DPLA]

Some Final Things to Remember

Check your own OAI stream and see what it looks like!

• Mapped to none – not in OAI stream
• Hidden set to yes – not in OAI stream
• CONTENTdm Field Properties Template and guide available at: http://mwdl.org/getinvolved/getinvolved.php
• Log in to collection admin, click on “collections” tab, go to fields to check and edit properties
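One way to check a stream is to count how often each field actually shows up across the records in a ListRecords response. The sketch below is an illustrative self-audit helper, not an MWDL tool: `field_coverage` is a hypothetical name, and the two sample records are invented for the example.

```python
from collections import Counter
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Invented sample: the second record has no rights statement.
SAMPLE_RECORDS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <ListRecords>
    <record><metadata>
      <dc:title>Main Street, 1910</dc:title>
      <dc:rights>Public domain</dc:rights>
    </metadata></record>
    <record><metadata>
      <dc:title>County Fair</dc:title>
    </metadata></record>
  </ListRecords>
</OAI-PMH>"""


def field_coverage(xml_text):
    """Return (record_count, Counter of field name -> records containing it)."""
    root = ET.fromstring(xml_text)
    coverage = Counter()
    total = 0
    for record in root.iter(OAI + "record"):
        metadata = record.find(OAI + "metadata")
        if metadata is None:  # deleted records carry a header only
            continue
        total += 1
        present = set()
        for element in metadata.iter():
            # Count leaf elements that hold real text, ignoring namespaces.
            if len(element) == 0 and element.text and element.text.strip():
                present.add(element.tag.split("}")[-1])
        coverage.update(present)
    return total, coverage


total, coverage = field_coverage(SAMPLE_RECORDS)
for field, count in sorted(coverage.items()):
    print(f"{field}: {count} of {total} records")
```

A field that appears in fewer records than expected, or not at all, may be one that is mapped to none or hidden in the collection's field properties, so it is worth checking there first.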

Field Mappings in CONTENTdm

Field Mapping example from the Western Soundscape Archive

Try it yourself!

Resources available at http://mwdl.org/getinvolved/getinvolved.php

We’re here to help!

• For additional questions about self-auditing your OAI stream, contact Anna Neatrour:
  – [email protected]
  – 801-587-8883

• Any questions?