The Many Faces of Semantics
Interlinking Music-Related Data on the Web

Yves Raimond, BBC Audio & Music Interactive
Christopher Sutton, Intrasonics
Mark Sandler, Queen Mary, University of London

This article describes how Semantic Web technologies can be used to interlink musical data sources that have traditionally been isolated and difficult to integrate.

Information management is an important part of multimedia, covering the administration of public and personal collections, the construction of large editorial databases, and the storage of analysis results. Applications for each of these aspects of multimedia management have emerged, with notable examples being Greenstone (see http://www.greenstone.org) for digital libraries, iTunes for personal media-collection management, MusicBrainz (see http://musicbrainz.org) for classification data, and traditional relational databases for managing analysis results.

However, despite the ability of these applications to work with different facets of multimedia information, they are typically isolated from one another. Sharing and reusing data, even between instances of the same tool, is difficult and often involves manual effort for one-time data migration. Common data formats reduce the need for such efforts, but restrict the expressivity of the applications' data. The problem becomes more difficult if we extend our range of interest to data that can be produced, for example, by audio-analysis algorithms that provide higher-level representations than the audio signals themselves.

A promising solution to such problems is to take a data-oriented rather than an application-oriented view, with a web of data that doesn't limit the formats or ontologies used to record information, and instead allows arbitrary mixing and reuse of information by applications. For example, an ethnomusicological archive might benefit from being linked to a geographical data set such as GeoNames (see http://geonames.org). In this way, an archive could be tightly focused on its primary topic and leave the burden of ancillary descriptions to other focused data sets.

In the same web of data, we could publish items corresponding to the potential output of a music-analysis algorithm. Such results could then be reused for further research. In this way, a research group publishing a new algorithm could leave the burden of computing its supporting data to other algorithms published by other groups. In this article, we describe our efforts toward building such a web of data for music-related information.

Toward a web of data

The need to make currently published information on multimedia resources available in a common, structured, interlinked format is a topic frequently discussed in this publication. Tim Berners-Lee's vision of the Semantic Web [1], and the vast array of technologies already established in pursuit of it, provide the functionality required to begin building such a web of data. This section provides a brief overview of the technologies currently being used.

Identifiers and descriptions

The W3C's Resource Description Framework, or RDF (see http://www.w3.org/RDF), allows the description of resources by expressing statements about them in the form of triples: subject, predicate, and object. Each element of such a triple is specified by a uniform resource identifier (URI). A set of triples can be interpreted as a graph of these resources, with arcs corresponding to the relationships between them.

RDF alone provides a common, structured format for expressing data. Interlinking data sets can be achieved by ensuring URIs are unique across data sets, and providing a common access mechanism for following references between data sets. In practice, HTTP proves ideal for this task. If each resource is identified by an HTTP URI (such as http://example.com/resource7341), we gain an established system for ownership of URIs, and we can traverse data sets using simple HTTP GET operations.
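To make the triple-and-dereference model concrete, here is a minimal Python sketch, not the authors' implementation. It treats triples as tuples of URI strings and replaces real HTTP GET requests with a dictionary lookup. Every URI, the `WEB` table, and the helper names (`http_get`, `build_graph`, `traverse`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical namespaces for two independently published data sets.
ARCHIVE = "http://archive.example.org/"
GEO = "http://geo.example.org/"

# Stand-in for the Web: the triples a client would receive for each URI.
WEB = {
    ARCHIVE + "recording1": [
        (ARCHIVE + "recording1", ARCHIVE + "terms/recordedIn", GEO + "place42"),
    ],
    GEO + "place42": [
        (GEO + "place42", GEO + "terms/name", GEO + "names/Bamako"),
    ],
}

def http_get(uri):
    """Simulated HTTP GET: return the triples published at `uri` (or none)."""
    return WEB.get(uri, [])

def build_graph(triples):
    """Index triples as a graph: subject -> list of (predicate, object) arcs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

def traverse(start, max_hops=2):
    """Dereference `start`, then the objects it points to, up to `max_hops`."""
    seen, frontier, collected = set(), [start], []
    for _ in range(max_hops):
        next_frontier = []
        for uri in frontier:
            if uri in seen:
                continue
            seen.add(uri)
            triples = http_get(uri)
            collected.extend(triples)
            next_frontier.extend(obj for _, _, obj in triples)
        frontier = next_frontier
    return build_graph(collected)

graph = traverse(ARCHIVE + "recording1")
```

Starting from the archive's recording, the traversal crosses into the geographic data set simply because the object of one triple is a URI owned by another publisher. Neither data set needs to know about the other's schema.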
1070-986X/09/$25.00 © 2009 IEEE. Published by the IEEE Computer Society. April-June 2009.

A user agent that wishes to know more about a resource x dereferences the URI of x by performing an HTTP GET operation on the URI address, and receives RDF data containing triples related to the resource. Doing so allows dynamic exploration of linked data sets that can be distributed across geographic locations, institutions, and owners, much like documents on the traditional Web [2].

This idea of using HTTP addresses to provide machine-processable resource descriptions might seem at odds with the current use of the HTTP namespace. However, various techniques (such as specification of content type by user agents; content negotiation by 303 redirects; and embedded microformats, or RDFa) allow the same HTTP address to provide both machine-processable data and human-readable HTML data describing a resource. The Semantic Web can therefore be built alongside the current Web, and a content publisher with knowledge of Semantic Web technologies can ensure the published data is useful to a human reader via a traditional Web browser and to a Semantic Web user agent performing data integration, reasoning, and deduction on behalf of a human user.

Semantic Web user agents

The term user agent describes any software acting directly on user requests. A Semantic Web user agent is one that accesses resources on the Semantic Web to satisfy a user's demands. One example of a Semantic Web user agent would be a simple browser, analogous to a modern Web browser, that allows the user to navigate data on the Semantic Web just as a Web browser allows a user to navigate Web sites.

Although we are beginning to see quite sophisticated uses of Web resources (such as scripts that modify Web page content on the fly and mash-ups that dynamically combine the functionality of multiple sites), considerable effort has gone toward working around the fact that the traditional Web is designed for documents rather than data. The Semantic Web, on the other hand, is designed from the outset to allow much more complex interaction with available data sources, so the term Semantic Web user agent encompasses more complex modes of data consumption, including programs that automatically explore and dereference extra resources to satisfy a user's query. A simple example is the Tabulator [3], which automatically dereferences resources if they are semantically marked "the same as" or "see also" from a retrieved resource, then provides the user with a view of all the information found.

Semantic Web user agents can also express complex queries directly to remote data publishers. One example is using SPARQL Protocol and RDF Query Language, known as SPARQL (see http://www.w3.org/TR/rdf-sparql-query), to submit a query to a publisher's SPARQL endpoint over HTTP. The query language is SQL-like, and allows requests ranging from simple describes ("return all information about resource x") to complex queries about the endpoint's database ("return the latest album from each artist who had an album in the US charts in the 70s"). These queries are typically issued to a single endpoint, but there is ongoing research into efficient mechanisms for streamlined querying of multiple endpoints (see http://darq.sourceforge.net).

Because Semantic Web ontologies (identifying important concepts and relations in a particular domain) are themselves part of the web of data, domain-specific user agents might encounter new ontologies. By reasoning on their relationships to known ontologies, these domain-specific user agents can handle data expressed using those ontologies.

Music-related web of data

There is a vast amount of music-related data currently online, some of it provided without restrictions (such as through the MusicBrainz database, FreeDB CD listings, the MusicMoz directory, Wikipedia articles, and the Jamendo and Magnatune labels) and some of it provided with copyright restrictions (such as through the All Music Guide, Gracenote, Amazon, and iTunes Music Store). Although interlinking between these resources would benefit all concerned, each data source instead uses its own identifiers, data formats, and APIs.

Providing unfettered access to the data is a first step toward flexible integration [4], but doing so necessitates writing code to combine data sources (for example, a mash-up that uses your Last.FM, see http://www.last.fm, listening profile to plot your recently heard artists on a map). In addition, new code must be written for each desired combination. If this data were instead integrated into the Web, such code would be unnecessary, and a generic
user interface could allow arbitrary reuse and new data combinations.

Music Ontology overview

Integration and interlinking data sources is possible even when they don't share a common ontology, but is much easier when they do. Toward this end, we contributed to the design of the Music Ontology [5], which provides a standard base ontology for describing musical information. Currently, it can describe a wide range of music information at three levels of detail: level 1 describes top-level editorial information, such as the data found in an ID3 tag; level 2 describes the process behind the production of music, whether in the studio, on a home PC, or in concert; and level 3 describes the structure and component events of the music being played, such as the notes, chords, or samples.

The Music Ontology is interlinked with other ontologies, most notably Functional Requirements for Bibliographic Records, the Timeline and Event ontologies, and the Friend-of-a-Friend (FOAF) ontology. Figure 1 depicts the main concepts in level 2, while the "Music Ontology Example" sidebar contains an example track description.

Figure 1. Describing a music production process using level 2 of the Music Ontology. Classes shown: mo:MusicArtist, mo:Composition, mo:MusicalWork, mo:Performance, mo:Sound, mo:Recording, mo:Signal, mo:Record. Properties shown: mo:compose, mo:produced_work, mo:performance_of, mo:produced_sound, mo:recorded_as, mo:recorded_in, mo:produced_signal, mo:published_as.

No single ontology could hope to cover the requirements of all music descriptions [6]. The Music Ontology, like any ontology that provides URIs for its terms, is designed to be extended with specialized ontologies. For example, the ontology itself provides only basic instrument and genre terms, but can be extended by using the Simple Knowledge Organization System adaptation of the MusicBrainz instrument taxonomy (see http://purl.org/ontology/mo/mit) and the DBpedia [7] adaptation of Wikipedia's genre taxonomy. In addition, some more complex extensions are available, dealing with chords and symbolic music notation (see http://purl.org/ontology/chord and http://purl.org/ontology/symbolic-music).

Linking open data

The open-data movement aims to make data freely available to everyone. We contribute to the Linking Open Data on the Semantic Web community project [8], which aims to interlink such open sources of information using the technologies described previously. For example, when providing a description of a particular artist in the DBTune project (see http://dbtune.org), we link the artist resource to a location in the GeoNames data set, which provides additional knowledge about this location, instead of providing a complete geographic description ourselves. An agent crawling the Semantic Web can jump from our knowledge base to the GeoNames one by following the link.

The Music Ontology helps this process for music-related information, providing a framework for publishing heterogeneous music-related content in RDF. Moreover, as mentioned previously, the Music Ontology can be extended with other ontologies to cover additional domains. For example, we can use the Music Ontology alongside ontologies for reviews, social networking, or geographic information, without having to work around any of the traditional forced boundaries between domains.

Members of the linked-data community have published and interlinked several music-related data sets in this web of data. To date, these data sets include MusicBrainz, DBpedia, and the Jamendo and Magnatune labels, published within our DBTune project. In addition, several BBC data sets have been published.

IEEE MultiMedia
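As a rough illustration of how level 2 of the Music Ontology chains a work to its published recordings, the following Python sketch walks a handful of triples from a musical work to a record. It is only a sketch: the `ex:` resources are hypothetical, the helper functions are invented for this example, and the direction of each property (which class it starts from and points to) is an assumption inferred from the property names rather than taken from the ontology specification.

```python
# Hypothetical description of one production workflow.
# Property directions are assumed from the property names.
TRIPLES = [
    ("ex:composition", "mo:produced_work", "ex:work"),
    ("ex:performance", "mo:performance_of", "ex:work"),
    ("ex:performance", "mo:produced_sound", "ex:sound"),
    ("ex:sound", "mo:recorded_in", "ex:recording"),
    ("ex:recording", "mo:produced_signal", "ex:signal"),
    ("ex:signal", "mo:published_as", "ex:record"),
]

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is asserted."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is asserted."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]

def records_of_work(work):
    """Follow performance -> sound -> recording -> signal -> record."""
    found = []
    for performance in subjects("mo:performance_of", work):
        for sound in objects(performance, "mo:produced_sound"):
            for recording in objects(sound, "mo:recorded_in"):
                for signal in objects(recording, "mo:produced_signal"):
                    found.extend(objects(signal, "mo:published_as"))
    return found

records = records_of_work("ex:work")
```

Because each step is an ordinary triple, any of the intermediate resources could live in a different data set, and the same walk would still work once those descriptions have been retrieved.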
Music Ontology Example

The Music Ontology allows the description of a wide range of musical data. Below is one such description that details an artist composing a musical work that is performed in a studio and recorded as an album track. Such a description could be handcrafted by a fan, extracted from an editorial database, or even automatically compiled by the music production tools used by the artist in creating his or her work.

Figure (sidebar). Labels shown: Friend of a friend; Jonathan Coulton; RIT's Surround Sound; Composition of Code Monkey; Studio performance; Live performance; Acappella arrangement.

    @prefix : [...]
    [...] rdf:type mo:MusicArtist ;
        foaf:name "Jonathan Coulton" ;
        owl:sameAs [...]
Figure 2. Management of personal music [...]. Labels shown: Audio collection; GNAT (Identify); Web identifiers for tracks; GNARQL (Load; Aggregate additional information); RDF; Interlinked data on the Web; User interaction (Via SPARQL).

Managing music collections

Personal music collections can be part of such [...] motools). GNAT is an implementation of automatic linking from a personal-audio collection to the MusicBrainz data set. It uses audio fingerprinting and available ID3 tags to find corresponding identifiers, then outputs RDF statements to make the links between local audio files and the remote manifestation identifiers. GNAT can be used to build small applications, such as for plotting an audio collection on a timeline to generate playlists of songs composed during a particular decade, or plotting an audio collection on a map to create playlists according to geographical data.

We are using our GNARQL tool in the same project to explore some of these application possibilities by loading the data from GNAT, then crawling the Semantic Web for more information about the user's audio collection. We see GNARQL as a Semantic Web version of the Exposé tool [10]. With the data sets currently available and interlinked, GNARQL can answer query requests such as "create a playlist of performances of works by German composers, written between 1800 and 1850" or "find rock bands from the 1970s that have more than five tribute bands." With only a little further work, we expect GNARQL will be able to handle more useful queries, such as "find gigs by artists similar to my most-played artists that fit with my vacation plan." The following is an example of such a query for SPARQL:

    PREFIX geo: [...]

[...] location. As shown in Figure 2, we can build
user interfaces on top of the SPARQL endpoint that GNARQL provides. In this case we are using the /Facet browsing interface [11] to interact with the data aggregated by GNARQL and plot the artists in a user's collection on a map.

Notation3 (sidebar)

We use Notation3 [1] in all our code snippets. Each block corresponds to a set of statements (subject, predicate, and object) about one subject. Web identifiers are either between angle brackets or in a prefix:name notation. Universally quantified variables start with ?. We denote a set of statements describing an existentially quantified variable with square brackets. Curly brackets denote a literal resource corresponding to a particular RDF graph. The keyword a corresponds to the identifier rdf:type. The keyword <= corresponds to the inverse property of log:implies.

Reference
1. T. Berners-Lee et al., "N3Logic: A Logical Framework for the World Wide Web," to be published in Theory and Practice of Logic Programming (TPLP); http://arxiv.org/abs/0711.1533.

Dynamic resources

So far, we have discussed only static resources, but the Semantic Web interface can be used to access dynamic resources that are computed only when requested (and possibly then cached for future requests). In the music realm, this means current research algorithms [12] for tempo and rhythm estimation, harmonic analysis, partial transcription, or source separation could be exposed as Semantic Web resources. Doing so would be of great benefit to researchers, allowing them to more easily compare or build upon others' algorithms. It would also benefit the general public by letting them use research algorithms without requiring each researcher to design end-user applications. In these cases, the Semantic Web would act as a processing web as well as a data web, providing a possible answer to concerns expressed earlier in this publication [13].

Because automated analysis tasks may require a significant amount of computation, care must be taken to avoid wasted effort. We can immediately discard the approach of precomputing all information on all known resources. In addition, because each known resource might have a wide range of computable information, computing all information about a particular resource when its identifier is dereferenced is unlikely to be an acceptable approach. Our approach is to expose algorithms as part of the data web in a nonwasteful manner. We use two examples to illustrate this approach. The first is a toy example corresponding to a trained instrument classifier, while the second is a content-based similarity service.

Modeling analysis algorithms

We consider every analysis algorithm as an RDF property, associating the results of an analysis with the inputs and parameters used. For example, we would describe a deterministic instrument classifier and a content-based similarity measure between two audio signals as follows (see the "Notation3" sidebar for more information about the code):

    mt:instrument
        a rdf:Property;
        a owl:FunctionalProperty;
        rdfs:domain mo:Signal;
        rdfs:range mo:Instrument;
        rdfs:label "instrument";
        rdfs:comment """
            Tries to determine the musical instruments involved in the
            creation of an audio signal.
            """;
        .

    mt:similarity
        a rdf:Property;
        a owl:FunctionalProperty;
        rdfs:domain rdf:List;
        rdfs:range xsd:float;
        rdfs:label "similarity";
        rdfs:comment """
            Computes a similarity measure, between 0 and 1, given two
            signals as an input.
            """;
        .

We consider these predicates built-in within our particular RDF store implementation. When queried in the right mode (with all input arguments and parameters bound), a computation binds the output arguments to the corresponding results. We can use these predicates to describe the analysis steps that produced a particular result.
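The idea of a built-in predicate that computes only when its inputs are bound can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' RDF store: the two analysis functions are toy stand-ins that return made-up values, and the names `BUILTINS`, `CALLS`, and `query` are hypothetical. Caching is done here with the standard library's `functools.lru_cache`.

```python
from functools import lru_cache

# Counts how often each analysis really runs (to show the effect of caching).
CALLS = {"mt:instrument": 0, "mt:similarity": 0}

@lru_cache(maxsize=None)
def instrument(signal):
    """Toy stand-in for mt:instrument: a fixed lookup, not a real classifier."""
    CALLS["mt:instrument"] += 1
    return {"ex:signal": "mit:Trumpet"}.get(signal)

@lru_cache(maxsize=None)
def similarity(signal_a, signal_b):
    """Toy stand-in for mt:similarity: a made-up value between 0 and 1."""
    CALLS["mt:similarity"] += 1
    return 1.0 if signal_a == signal_b else 0.5

# Hypothetical registry mapping predicate names to their computations.
BUILTINS = {"mt:instrument": instrument, "mt:similarity": similarity}

def query(predicate, inputs, output=None):
    """Run a built-in only in the 'right mode': every input must be bound."""
    if any(value is None for value in inputs):
        raise ValueError("all input arguments must be bound")
    result = BUILTINS[predicate](*inputs)
    # With an unbound output, bind it to the result; otherwise check it.
    return result if output is None else result == output

first = query("mt:instrument", ("ex:signal",))
second = query("mt:instrument", ("ex:signal",))  # served from the cache
```

Only the predicate that is actually queried does any work, and repeated queries with the same inputs reuse the cached result.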
Consider the following axiom (such an axiom could be accessed as part of the representation of a Web resource), which derives a higher-level interpretation of the mt:similarity predicate:

    {?signal1 mo:similar_to ?signal2}
    <= { ?signal1 a mo:Signal. ?signal2 a mo:Signal.
         (?signal1 ?signal2) mt:similarity ?distance.
         ?distance math:lessThan "0.8"^^xsd:float
       }.

If we put a Web server on top of our built-in predicates and this derivation rule, a request for the description of any accessible signal will dynamically derive statements holding information about the musical instrument used and about which signals are similar. Because we might have a large number of such built-in predicates, we must take care to ensure that we don't waste computation effort. We detail one approach to avoiding waste below.

Advertising dynamic resources

We developed a simple approach to exposing algorithms on the Semantic Web. It's compatible with existing user agents while avoiding wasteful computation. We begin by publishing only advertisement statements for the computation axioms, providing user agents with a URI that, when dereferenced, will trigger a unique class of computations. The property we used for these advertisement links is the property they advertise (mt:instrument and mo:similar_to, in our example).

For instance, an endpoint providing such a mechanism on top of the axioms mentioned earlier will issue just two statements when a description of an accessible signal ex:signal is requested:

    ex:signal mt:instrument ex:advert1.
    ex:signal mo:similar_to ex:advert2.

Then, when a user agent requests ex:advert1, the built-in mt:instrument analysis is triggered and we append the resulting statements to the returned description. To make the process transparent for the client, we state that ex:advert1 is the same as the matching output. The advertisement resource the client retrieves is the actual result. This process leads to the following RDF document, accessed via an HTTP GET on ex:advert1:

    ex:advert1 owl:sameAs mit:Trumpet.
    ex:signal mo:instrument mit:Trumpet.

When a user agent requests ex:advert2, the built-in mt:similarity analysis computes a content-based distance between the ex:signal audio signal and every other signal we know about. Then, this distance is thresholded according to the previous axiom to give the following RDF statements:

    ex:advert2 owl:sameAs ex:signal23.
    ex:signal mo:similar_to ex:signal23.
    ex:signal mo:similar_to ex:signal36.
    ex:signal mo:similar_to ex:signal42.

The ex:advert2 resource is the same as one of the similar signals. It is inconsequential which signal is chosen, though results should be consistent across multiple requests to the service.

By using such a mechanism, only the computation that the user agent is interested in will be triggered. If the user agent looks for an instrument associated with a signal, it will trigger a call only to mt:instrument. If it looks for similar signals, it will trigger a call only to mt:similarity. Where appropriate, we can cache the results of these underlying built-in predicates for later use.

Content-based similarity

We adapted a content-based similarity measure developed for the playlist-generation tool SoundBite (see http://www.isophonics.net/SoundBite) to use for SBSimilarity (see http://www.isophonics.net/SBSimilarity). We started with a database consisting of 300,000 tracks, and incorporated it into a SPARQL endpoint built using Jena and Joseki (see http://jena.sf.net). With precomputed features for each track, we can quickly perform similarity searches across the database by computing the distance between each track and the requested track.

The system performs approximately 300,000 simple distance calculations to satisfy each similarity request. We used dereferenceable identifiers for the tracks from the MusicBrainz database, and set up URIs
using URISpace (see http://code.google.com/p/km-rdf). Doing so provided on-demand computation of between-track similarity measures, with track descriptions linking to basic editorial metadata along with audio previews and purchase pages on the Amazon music store.

We built a simple interface that ties in iTunes, illustrated in Figure 3. Users can browse the Semantic Web to find more information about the tracks, preview their audio, and perhaps choose to buy them from Amazon.

This user interface demonstrates how an audio collection can easily provide entry points for exploring the growing body of data on the Semantic Web. It also shows how computation can be hidden behind Semantic Web resources to provide user agents with a uniform data-access mechanism for static and dynamic information, which can be either human-authored (coming from an editorial database such as MusicBrainz, for example) or automatically extracted from multimedia content (as in our similarity example).
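The way computation hides behind ordinary resources can be sketched as a pair of Python functions: one that returns only advertisement statements, and one that runs the analysis when an advertisement is dereferenced. This is a simplified sketch, not the SBSimilarity service. The distance table holds made-up values, the instrument result is hard-coded, and the function names are hypothetical. The statements returned mirror the ex:advert1 and ex:advert2 listings in the text, and the 0.8 threshold is the one used in the derivation rule.

```python
THRESHOLD = 0.8  # same cut-off as the derivation rule in the text

# Made-up distances from ex:signal to other signals, for illustration only.
DISTANCES = {
    "ex:signal23": 0.2,
    "ex:signal36": 0.5,
    "ex:signal42": 0.7,
    "ex:signal57": 0.9,
}

computed = []  # names of the analyses that have actually been run

def describe(signal):
    """Cheap description: advertise the computations without running them."""
    return [
        (signal, "mt:instrument", "ex:advert1"),
        (signal, "mo:similar_to", "ex:advert2"),
    ]

def dereference(advert, signal="ex:signal"):
    """Run only the analysis behind `advert` and return its statements."""
    if advert == "ex:advert1":
        computed.append("mt:instrument")
        result = "mit:Trumpet"  # hard-coded stand-in for a classifier output
        return [(advert, "owl:sameAs", result),
                (signal, "mo:instrument", result)]
    if advert == "ex:advert2":
        computed.append("mt:similarity")
        # Sorting makes the owl:sameAs choice stable across requests.
        similar = sorted(s for s, d in DISTANCES.items() if d < THRESHOLD)
        statements = [(advert, "owl:sameAs", similar[0])] if similar else []
        return statements + [(signal, "mo:similar_to", s) for s in similar]
    return []

adverts = describe("ex:signal")
```

Requesting the description of a signal costs nothing beyond two statements. The expensive work happens only for the advertisement a user agent chooses to follow.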
Future work

The mechanisms we've described here provide a way to break down analysis into component steps and distribute the computation across multiple hosts. For example, we could break down our single built-in predicate mt:similarity into several [14]: