Undefined 0 (0) 1 1 IOS Press

A Dialogue with Linked Data: Voice-based Access to Market Data in the Sahel

Victor de Boer a,∗ Nana Baah Gyan a Anna Bon b Wendelien Tuyp b Chris van Aart a Hans Akkermans a a Department of , VU University Amsterdam, The Netherlands Email: {v.de.boer, n.b.gyan, c.j.vanaart, j..akkermans}@vu.nl b Center for International Cooperation, VU University Amsterdam, The Netherlands Email: {a.bon, w.tuyp}@vu.nl

Abstract. The Linked Data movement has facilitated efficient data sharing in many domains. However, people in rural developing areas are mostly left out. Lack of relevant content and suitable interfaces prohibit potential users in rural communities to produce and consume Linked Data. In this paper, we present a case study exposing locally produced market data as Linked Data, which shows that Linked Data can be meaningful in a rural, development context. We present a way of enriching the market data with voice labels, allowing for the development of applications that (re-)use the data in voice-based applications. Finally, we present a prototype demonstrator that provides access to this linked market data through a voice interface, accessible to first generation mobile phones.

Keywords: Voice Interface, Linked Data for Africa, Linked Market Data

1. Introduction ate3. Furthermore, there is a need for knowledge shar- ing systems to deal with specific use cases and content Development and use of the Web of Data has un- relevant to people in the target communities. Locally til now mainly focused on developed countries, as was relevant content can be an incentive for people to ac- the case with the Web of Documents before it. 4.5 cess and use the knowledge sharing systems [1]. billion people -mainly in developing countries- cur- Currently, a number of efforts are being undertaken rently can not access the World Wide Web. The rea- to bridge this so-called ’digital divide’ in the World sons for this include infrastructural ones such as a Wide Web, including the recent forming of the Web Foundation. As was argued in [2], while the Web of lack of high bandwidth connections and reli- Documents has been around for 20 years, as engineers able power supplies as well as socio-economic issues of the much newer Web of Data, we have the oppor- such as the high cost of buying Personal Computers, tunity to not let the “digital Linked Data divide” grow language mismatches and lack of reading and writing too large. To avoid a seemingly unbridgeable gap, we abilities [1]. For our case study in Mali, only 1.8% of should consider the underprivileged majority and these the population has Internet access1, only 10% has ac- 2 barriers as we design Linked Data architecture, de- cess to the electricity network , and only 26.2% is liter- scribe use cases and provide access to that Linked Data.

*Corresponding Author 1http://www.internetworldstats.com/ Internet 061129\%20RECIPES\%20country\%20info\%20Mali. World Statistics, Miniwatts Marketing Group. pdf 2http://www.developingrenewables.org/ 3http://www.indexmundi.com/facts/ energyrecipes/reports/genericData/Africa/ indicators/SE.ADT.LITR.ZS Index Mundi 2011

0000-0000/0-1900/$00.00 c 0 – IOS Press and the authors. All rights reserved 2 V. de Boer et al. /

1.1. Mobile technologies 1.3. Contributions

One way of overcoming the infrastructural and liter- The contributions of this paper are the following. acy barriers is to turn to simple mobile phones, GSM architecture and voice technologies to build knowledge – We describe a knowledge sharing use case in sharing services. In Africa, mobile telephony has be- Mali, Africa and a Market Information System come the primary mode of telecommunication [3]. In (MIS) adapted to the local context, RadioMarché 2006, an estimated 45 percent of Sub-Saharan African (RM). Regarding the above mentioned chal- villages were covered by a mobile signal[4]. And in lenges, RM is not dependent on Internet infras- 2009, Africa showed the fastest rate of subscriber tructure, and has voice-based and sms-based in- growth, introducing 96 million new mobile subscribers terfaces. We describe the current status of deploy- in a period of only twelve months [5]. The widespread ment of this system.. availability of mobile phones and increasing level of – The market data gathered using RadioMarché coverage creates great opportunities for new services. have been converted to and exposed as Linked Open Data. In Section 4.1, we discuss the data, 1.2. Linked Data for knowledge sharing in rural its model and opportunities for re-use. We show areas in developing countries how our design choices offers opportunities to Linked Data as a paradigm is very much suitable link aggregated market information to datasets for knowledge sharing in developing countries. Linked from other domains. The resulting data provides Data approaches provide a particularly light-weight an open innovation platform to develop services way to share, re-use and integrate various data sets us- with augmented reasoning capabilities for local ing Web standards such as URIs and RDF. It does not entrepeneurs, NGOs, governments, policy mak- require the definition of a specific database schema for ers, traders and scientists. a dataset [6]. We assume that the majority of the use – We propose a method of adding voice information of the locally produced data will also be consumed lo- to the linked data produced in this way through cally. Although the specifics of the locally produced the use of pre-recorded audio files, accessible data will differ from use case to use case and from re- through the Web, thus allowing for re-use of this gion to region, Linked Data provides us with a standard data in voice-based applications, relevant in the way of integrating the common elements of the data. low-literate, low-infrastructure context. We also Also, because we do not impose a single overarching describe a partial implementation of this voice schema on the data, data reuse for new services is eas- data for the RadioMarché Linked Data case. ier, both within a region and across regions. In Section – We have developed a prototype voice application 4.2, we present an example of a prototype application that uses this data. This proof-of-concept demon- reusing linked market data. strator provides voice-based access to the Ra- An additional advantage is that Linked Data is well- dioMarché linked data. This demonstrator is de- suited for localization efforts. Linked Data is very scribed in Section 4.4. much suited for dealing with multilinguality as its core concepts are resources rather than textual terms. Where the Web of Documents, by design, is language- 2. Related Work specified, Linked Data is designed to be “language ag- nostic”, which suits our purpose of multilingual and We can find many efforts towards the usage of mo- voice-based access well. In Section 4.3, we elaborate bile technologies in an ICT for Development (ICT4D) on our efforts to attach ’voice labels’ to the data. context. In 2013 the International Telecommunication The combination of Linked Data, mobile services Union (ITU) organized a forum discussing how mo- and voice technology allows us to integrate and share bile and broadband connectivity can be a tool in global locally relevant information, while at the same time economic development 4. There are many international allowing local stakeholders to access the data using research initiatives and practitioners turning to mo- available hardware and voice interfaces. In this paper, bile services for information sharing in rural develop- we describe a prototype solution based on Linked Data for a specific use case in the rural Sahel region that allows voice-based access to the data. 4http://www.itu.int/en/wtpf-13/ V. de Boer et al. / 3 ing areas. Examples include SPIDER 5, IDRC 6 or the A related project on Linked Data for developing coun- Nethope network of ICT4D practitioners 7. Success tries is described by Guéret et al. [15]. The Seman- stories of such mobile services include studies in fish- ticXO is a system that connects rugged, low-power, ing villages in India [7], grain markets in Niger [8] or low-cost robust small laptops for empowerment of crop markets in Uganda [9]. poor communities in developing countries. Related work on voice technologies started in the 1930s in research on . The first commercial deployments of voice-based services took 3. The RadioMarché Market Information System place in the early 1970s. Major achievements on language recognition took place in the 1980s and Our efforts center around, RadioMarché [16,17], a 1990s, but this was mainly focused on English. While web-based information system developed within the Text-To-Speech and Speech Recognition are key in VOICES project8 aimed at stimulating agricultural voice application development, the creation of the trade in the Sahel region. The RadioMarché system is a VoiceXML standard by the W3C MIS that supports local farmers of Non-Timber Forest group, in 1999, further facilitated the development of Products (NTFPs) such as honey, tamarind and shea voice applications [10]. nuts. It was introduced under support of a local partner Agarwal et al. from IBM Research India, devel- NGO, Sahel Eco9, in the Tominian Area in Mali. oped a system to enable authorship of voice con- RadioMarché is used to distribute up-to-date mar- tent for 2G phone in a Web space, they named the ket information via community radio in the area. A WWTW (World-Wide Telecom Web). The whole sys- Sahel Eco staff member receives offerings from local tem creates a closed web space, within the phone farmer’s representatives in the form of an SMS text network. Linking from one voice site to the other is message, containing information about a product of- done through a protocol HSTP, created by IBM. Espe- fer: quantity, quality, price, name of the seller, village, cially the lack of open search possibility constrains its phone number, etc. The SMS information is entered growth [11]. manually into a web form. The information is then ac- Several automated Market Information Systems cessible for employees of local community radio sta- (MIS) have been developed and built to support farm- tions through either the Web or through a first genera- ers and agricultural trade in developing countries. One tion mobile-phone accessible voice interface. of the well-known systems is ESOKO [12], an on- For the voice interface, the market data is converted line market system, developed and built in Ghana. ES- to audio files using a slot-and-filler Text to Speech OKO enables sellers and buyers to exchange market (TTS) system [18]. This audio is produced in multi- information. Google started a project in Uganda in ple languages, including "small" languages spoken in 2009, partnering with MTN and Grameen Foundation the region such as Bambara and Bomu. The gener- to develop mobile applications that serve the needs of ated audio is made accessible through an interactive poor and other vulnerable individuals and communi- voice response (IVR) system using the open standard- ties, most of whom have limited access to information ized VoiceXML10 language. Radio station personnel and communications technology [13]. This system is can access the market information in their specific lan- based on SMS but does not allow voice access. guage using their phone and broadcast the audio to the The Web Foundation has started the Open (Gov- public. ernment) Data to "Conduct country level actions and RadioMarché has been deployed from Novem- global actions to increase the impact and benefits of ber 2011-November 2012 in the Tominian region in Open Data worldwide" [14]. This effort focuses on Mali. Feedback from this pilot phase was positive and opening government data in developing countries such showed that indeed the system was used to distribute as Ghana. Our data is initially designed to be produced market information in the region more effectively. In and consumed by the regional farmers themselves. fact, the demand for honey offered became so large Linking our regional data to the (Linked) Open gov- that the supply could not cope [17]. For this reason, ernment data could increase the value of both datasets. the RadioMarché system is at the time of writing not

5https://spidercenter.org/ 8http://www.mvoices.eu 6http://www.idrc.ca/ 9http://www.saheleco.net 7http://nethope.org/ 10http://www.w3.org/TR/voicexml20/ 4 V. de Boer et al. /

linked data store of our prototype are synchronized. The RDF triples are stored in a triple store12. Linked Data Cloud We utilize PURLs for the resource URIs with an

VoiceXML-to- 13 appropriate namespace chosen . An HTTP request to SPARQL

XMLRDF

Market data rewrite rules Linked market data these PURL URIs is redirected to the ClioPatria server.

Locally deployed Through ClioPatria’s Linked Data package, the RDF Voice browser RadioMarché ClioPatria Linked Data server (Voxeo Evolution) instance data is accessible as Linked Open Data. The result of an HTTP request for a resource is either a human- Web access Voice access readable web page or the raw RDF triples describing the resource depending on the accept header in the HTTP request. For example rm:village_Samoukuy/ shows all information about the Samoukuy village. A Fig. 1. Schematic representation of the linked market data prototype SPARQL endpoint is also provided14. demonstrator. As of March 2013, 90 market offerings are in the triple store. These market offerings are made by 17 collecting and disseminating new offerings. However, different farmers, living in 16 different villages spread Sahel Eco is currently looking to restart the service, across 6 regional “zones”. The market offerings con- embedding the service in other efforts that stimulate tain the quality, quantity and type of the product as well the value-chain. One key lesson here is that even a suc- as the price and contact information. Furthermore, of- cessful ICT deployment is not sustainable if further up ferings are grouped in so-called “communiqués”, de- or down the value-chain the requirements created by signed to be sent to local radio’s for broadcast. In total, the new situation can not be met. the market data consists of 1,952 triples. We have manually added a number of mapping links: 4. The Linked Market Data Demonstrator – Zones and villages are mapped to GeoNames15 concepts. Not all villages identified in the GeoN- In this section we describe a prototype demonstra- ames dataset, speaking to the remote nature of the tor developed in parallel which exposes the market data. data gathered through the RadioMarché system using – The product types were mapped to the Agrovoc 16 Linked Data approaches, so that new opportunities for thesaurus which contains in up to 21 languages product and service innovation in agriculture and other covering topics related to food, nutrition, agricul- domains can be unleashed. ture, fisheries, forestry, environment and other re- lated domains. – Some zones and villages as well as products are 4.1. The linked market data mapped to DBpedia17 resources. – To facilitate reuse of this e-business data, we have The high-level setup of the Linked Data prototype therefore established a partial mapping from our demonstrator and how it relates to RadioMarché is schema to the GoodRelations vocabulary [21]. In shown in Figure 1. To obtain the Linked Data, we total 7 classes and 9 properties are mapped to translate a copy of the up-to-date market informa- GoodRelations classes and properties. tion from the RadioMarché prototype deployed in the

Tominian region to RDF triples using the XMLRDF 12 11 The triple store is available at http://semanticweb.cs. tool of the ClioPatria semantic framework [19]. The vu.nl/radiomarche/ XMLRDF tool allows us to convert XML documents 13http://purl.org/collections/w4ra/ to arbitrary RDF structures in an interactive way, us- radiomarche/ 14 ing simple RDF graph rewriting rules. The conversion http://semanticweb.cs.vu.nl/radiomarche/ sparql/, an interactive SPARQL query environment is provided is run whenever the RadioMarché database is updated at http://semanticweb.cs.vu.nl/radiomarche/ to ensure the database of the deployed version and the flint/ 15http://www.geonames.org/ 16http://www.fao.org/agrovoc/ 11http://semanticweb.cs.vu.nl/xmlrdf/ 17http://en.dbpedia.org/ V. de Boer et al. / 5

Fig. 2. Screenshot of the "Ecomash" prototype application. This application uses linked external information from GeoNames, DBPedia and Agrovoc to display data on a map and provide background information [20].

– Similarly, we have added 5 mappings from RM ité”@fr). For widely spoken world languages, such as classes and properties to the FOAF vocabulary18. English, French or Chinese, good off-the-shelf TTS systems are available, which can be used by voice- 4.2. Ecomash based systems to utter a product name in that lan- guage. However, for smaller languages, these TTS sys- To show the added value of linking the locally pro- tems do not exist. In our case, the Bambara and Bomu duced market data to these external sources on the Web languages of the RadioMarché stakeholders lack such of Data, we have developed prototype mashup applica- systems. In this case, augmenting the data with voice tion, Ecomash. This application was based on require- labels increases the opportunities for re-using this mar- ments put forward by our local NGO partner. Figure ket data for voice-applications. Since this concerns a 2 shows a screenshot of this mashup application. This limited amount of data within a closed domain, it is application is designed to be used by this and other feasible to record voice labels for the data items. NGOs to monitor local markets. Local product offer- We propose voice labels, that are complementary to ings can be filtered in an interactive table or browsed rdfs:label constructs and link web-accessible wave via an interactive map (using the GeoNames links) and files, identified through URIs to RDF resources. For external product information (from the DBPedia and this purpose, we introduce the speakle:voicelabel Agrovoc links) are presented to the user[20]. properties19. To allow for multilingual voice labels, we furthermore define language-specific voice la- 4.3. Linked voice data bel subproperties (speakle:voicelabel_en, speakle: voicelabel_fr etc.). Rather than have a literal as RDF allows for a single resource, identified by a range, the voice label property points to an audio web URI (eg. rm:shea_nuts) to have multiple such as a wave file. Voice applications can labels (eg. “Shea Nuts”@en and “Amande de Kar-

19The current namespace for the speakle schema is http:// 18http://xmlns.com/foaf/spec/ purl.org/collections/w4ra/speakle 6 V. de Boer et al. /

rdfs:label “1000”

speakle:voicelabel_ba rm:1000 rm:audio_1000_ba.wav

speakle:voicelabel_nl rm:audio_1000_nl.wav

rm:offering0001 rdfs:label “kilo”@en

speakle:voicelabel_ba rm:kilo rm:audio_kilo_ba.wav

speakle:voicelabel_nl rm:has_contact rm:audio_kilo_nl.wav

rdfs:label “Amande de Karité”@fr

“Shea Nuts”@en rdfs:label

rm:shea_butter speakle:voicelabel_ba rm:audio_shea_ba.wav rm:Mazankuy_Diarra rm:audio_shea_nl.wav speakle:voicelabel_nl

Fig. 3. Example RDF graph snippet of linked market data in the triple store. Resources have URIs and are repre- sented using ellipses, typed relations between resources and to literal labels are represented using arrows. The namespace http://purl.org/collections/w4ra/radiomarche/ is abbreviated to rm:. The figure shows how web-accessible multilingual audio files are related market data resources. Note that only a small part of the market data related to offerings and product types is shown. fetch these audio files when needed and insert them phone. As a proof-of-concept, we have implemented into the voice as needed. The speakle a voice-based interface to the linked market data. The schema is currently very limited but can be extended voice application is built using VoiceXML [10], the to allow -for example- for audio labels of different industry standard for developing voice applications. speakers. In Figure 3, we show how in the case of The application allows limited browsing of the latest our linked market data audio and literal labels can be market offerings. We have implemented a partial bi- mixed, for different languages. As a proof of concept, lingual interface, in English and Dutch, using the voice for our Linked Market Data demonstrator, we recorded labels in the voice interface. a number of audio labels in Dutch and English20 and The prototype voice application is running on the added them to the Linked Data using the voice label Voxeo Evolution platform21. This platform includes a triples. voice browser, which is able to interpret VoiceXML documents, includes (English) TTS and provides a 4.4. Voice-based access to Linked Data number of ways to access the Voice application. These include the Skype VoIP number +99000936 As presented in the previous sections, the Ra- 9996162208 and the local (Dutch) phone number dioMarché linked market data can be browsed through +31208080855. When any of these numbers is called, a web interface or can be re-used in mashup appli- the voice application accesses a VoiceXML document. cations. However, as stated, our goal is to provide This document defines the dialogue structure for the a voice-based interface that allows market informa- application. In the current demonstrator, the caller is tion access for users having a first-generation mobile presented with two options, to browse the data by prod-

20Languages spoken by the authors 21http://evolution.voxeo.com V. de Boer et al. / 7

uct or region, or to listen to the latest offering. The caller presses the code on his or her keypad (Dual Tone

Multi-Frequency or DTMF). The voice application in- terprets the choice and forwards the caller to a new These are the top three current offerings for Tamarin voice menu. For products, the caller must select the type of product (“press 1 for Tamarind”, “press 2 for query is then passed to the Linked Market Data server, which returns the appropriate results in the correct lan-
guage. For a product query, all (recent) offerings about that product are returned, including the links to the wave files of the quantity, unit, contact information etc. Fig. 4. Example VoiceXML code snippet, generated by the PHP For each result the PHP code inserts the URLs of the code, using the results from a SPARQL query. For brevity, the URLs of the audio files are abbreviated (...). Note that the introduction “The voice resource into the VoiceXML code at the appro- following...”, “offers” and “for” are to be interpreted by the (in this priate spot. In this way, a sequence of audio files is case English) TTS. These utterances are considered part of the ap- played to the caller, providing him or her with the re- plication and it would be the responsibility of the application devel- quested information. Figure 4 shows a fragment of the oper rather than the (linked) data provider to provide audio versions VoiceXML code. for them. Of course, the current method of accessing the data as the application-specific utterances would have to be is only one of many possible actions. In a different recorded locally. voice application a caller could be presented with ad- vanced filtering options (“enter the maximum price for 5.1. Sustainability offerings of product X”, “enter a date range for prod- uct offerings”) or combinations of data queries. How- As stated in Section 3, the RadioMarché service is ever, because of the slow and linear nature of voice in- terfaces -when compared to visual user interfaces- op- currently online but not accepting or distributing new tions have to be limited more than with visual inter- offerings due to issues in different parts of the value faces. This means that in our research we will iden- chain. Sahel Eco is currently looking at restarting the tify useful services on this data and provide Voice-to- service, which would result in a steady flow of new SPARQL mappings for these services. market information into the database and therefore in the linked data store. To ensure sustainability, we are currently transfer- 5. Discussion and work in progress ring the hardware and software that runs the service to a local business partner22. Local ownership and con- The previous sections describe first efforts into ex- trol is one of the key issues for the sustainability of posing locally produced data in rural areas in devel- rural ICT projects [22]. Within the VOICES project, oping countries as linked data. We have described a multiple business models have been developed to an- small prototype voice application that provides access alyze the financial sustainability of the RadioMarché to this data. Currently, to allow for easy testing, this ac- project [23]. cess is in English and Dutch only. However, for the Ra- One issue concerns the costs of phone calls for ac- dioMarché system, a great number of audio recordings cessing the voice interface. Interviews with stakehold- -used in the slot-and-filler TTS- are available that cor- ers confirm a willingness to pay for phone calls to a respond to the data items in the Linked Market Data. system, given that some economic benefit is gained We can use these to add Bambara and Bomu voice la- from the call. However, we are looking at ways of lim- bels to the linked data. We are currently planning on realizing this full integration. Any missing items, such 22http://yeleman.com/ 8 V. de Boer et al. / iting these costs. One opportunity is allowing for ac- Before deploying a next version of the system, user cess to data via (cheaper) SMS text messages. Simple evaluation of the voice interface and the system as a text queries can be processed by the MIS and replies whole is necessary. can be presented in SMS text messages. An added benefit is that the answer is stored on the user’s de- 5.2. Related linked data sets under production vice. However, in our specific case study, most users have a low literacy level and are not comfortable using In a number of related projects, we are producing text messages. The arrival of higher-generation mobile linked data sets that will be related to the linked market phones (i.e. feature phones) allows the development of data as well as to external sources. text-to-voice applications that can generate voice mes- Meeting Scheduler. Within the VOICES project we sages out of the text. This would overcome the lit- have developed a voice system for a second use case: eracy barrier, while still cutting costs. However, cur- a meeting scheduling system [17]. This system “Ta- rently such phones are not available in our case study bale” provides local NGOs with a more effective way context. to transfer agricultural knowledge about non-timber Another way of reducing costs for end users is forest products to their farmer community. The ser- vices developed in this case study provide voice access having other stakeholders reimburse costs made for to personal and scheduling information. By integrating calling. We are currently investigating various value this information with the market information from Ra- propositions which involve different local stakeholders dioMarché, personal profiles can be enriched with in- including local entrepreneurs, NGOs, farmers, poten- formation about the type of products that specific farm- tial buyers and community radio stations broadcasting ers have been producing within a given period. Here a the market data. new scheduling and notification service can re-use the Currently, the linked market data and the voice inter- market information within a region. face demonstrator are still maintained using the Voxeo Pluvial data. We have developed a crowdsourcing Evolution platform, PURL servers and the VU Univer- platform to transform photocopied data about rainfall sity Amsterdam web server. The voice application is in the Bankas area in Mali to Linked Open Data [24]. also only reachable through a Dutch local phone num- This platform targets the ‘diaspora’, e.g. people origi- ber or Skype access. Sustainability of the linked data nally from the region that have since moved to devel- and the client application is currently ensured through oped countries, where they might have better access the stability of the Linked Data platform at the Univer- to web browsers. The pluvial Linked Data acquired sity and the ability to convert new data as it is gathered in this way will be linked to the aforementioned data. in the RadioMarché system. Even beyond the VOICES This can be exploited by our partner NGO as well as project, this service will be maintained at the univer- other NGOs to analyze for example correlations be- sity. However, to ensure sustainability in the long run, tween rainfall and market offerings. this infrastructure would also have to be moved to IDS data. The Institute for Development Studies re- the rural regions itself as much as possible. The Or- cently published an API exposing more than 30.000 ange Emerginov platform23 can provide the web server publications about development research24. We have and voice browser technology needed for this infras- developed a wrapper around the IDS API to expose its tructure and include local Malian phone numbers. The content as high quality Linked Data, enriching it with Linked Data servers, voice-interfaces and client ap- connections to other Linked Data datasets25. These in- plications can be moved to this platform at testing or clude both general datasets such as DBPedia or GeoN- deployment time. A second option is entirely local. ames as well as datasets with information from devel- This version has the data and applications running on oping countries that are currently being realized. a web-connected dedicated laptop that is be deployed IATI data We have also produced a linked data locally. The voice channel is provided by a local voice from the International Aid Transparency Initiative 26 browser and a GSM gateway (2N OfficeRoute) device (IATI) , which lists transparency data about develop- connected to the laptop that allows phone calls to be ment projects. The data is linked to external resources handled by the system on the laptop. 24http://api.ids.ac.uk 25http://api2lod.appspot.com/ 23http://www.emerginov.org/ 26http://iatistandard.org/ V. de Boer et al. / 9 to reuse and visualize this data. Our aim is to link this tical, we actually have gathered most of the French, more general development-related data to the results of Bambara and Bomu audio fragments over the course of those development efforts in the field, through for ex- the project. We are currently converting some of these ample funding organizations. By linking the data that audio resources and linking them to the data as voice describes project results to the IATI data, NGO’s can labels in the manner described in 4.3. As far as gath- increase their own transparency and accountability. At ering new voice fragments is concerned, for specific the same time, we are investigating voice-based access domains scripts have to be constructed that cover all of to this transparency linked data using the voice labels the data items. As an illustration of the effort needed, as described here, which would allow people in rural we have made an example of such a script used in the areas to query the aid data that concerns them. RadioMarché MIS available for download27. Further- By integrating the Linked Market data with these more, future work to show the added benefit of the and other datasets, we can create a Web of Data that Linked Data approach is to also develop different voice is relevant to as well as accessible for people living in applications that reuse these audio linked market data developing areas, bridging the digital divide. voice labels. We are currently also looking at adding voice labels to different linked data, for example cultural heritage 6. Conclusions and further work information in the Netherlands. This could be used in a voice-based application aimed at -for example- users In this paper, we have described a Linked Data set with accessibility issues. derived from a Market Information System currently We are also investigating methods of gathering deployed in Mali, Africa. This Linked Data set is up- to-date and available for re-use. This dataset represents voice fragments using crowdsourcing initiatives [25]. data that is relevant to the different stakeholders in the Crowdsourcing can be a good way of producing large specific region and therefore shows that Linked Data amounts of voice fragments for relatively small cost. could be valuable within that region. We have estab- This would allow for larger-scale projects and make lished links to related datasets and vocabularies allow- further progress towards a voice accessible Web of ing for new applications that use the integrated data. Data. We show one such mashup application, the Ecomash prototype. This case study shows that indeed meaning- ful Linked Data can be produced and used in a rural, Acknowledgements development context. We show that this data can be enriched by establishing links to external sources. We The authors would like to thank Mary Allen and also describe how this data can be reused in new and Amadou Tangara of Sahel Eco, Stephane Boyera, Max useful ways, including through a prototype mashup ap- Froumentin, Aman Grewal and the farmers and ra- plication. dio employees in Mali for their collaboration. We We also show how the linked data paradigm allows would also like to thank Christophe Gueret, Henk the market data to be easily expanded with audio re- Kroon and Rokhsareh Nakhaie for their involvement. sources, allowing for development of applications that This research is partly funded by the European Union (re-)use the linked data in voice-based applications. through the 7th Framework Programme (FP7) under We have also developed one such voice-based applica- grant agreement Num. 269954. tion allowing access to the linked market data. In terms of this specific case, future work includes the further development and user testing of the ap- plication. For this, further localization is needed and References requires adding French, Bambara and Bomu audio labels. Using this, we can recreate the current Ra- [1] Stephane Boyera, W3C. Mobile mobile web for social devel- dioMarché voice user interface using the (Voice) opment roadmap. http://www.w3.org/TR/2009/NOTE-mw4d- Linked Data. The choice for English and Dutch makes roadmap-20091208/, 2009. that the proof-of-concept is not directly deployable in rural areas in Mali as is. However, since the Ra- 27http://www.few.vu.nl/~vbr240/voices/ dioMarché system’s audio fragments are nearly iden- recording_script.pdf 10 V. de Boer et al. /

[2] Christophe Guéret, Stefan Schlobach, Victor de Boer, Anna [16] Victor de Boer, Pieter De Leenheer, Anna Bon, Nana Baah Bon, and Hans Akkermans. "is data sharing the privilege of a Gyan, Chris van Aart, Christophe Guèret, Wendelien Tuyp, few? bringing linked data to those without the web". Outra- Stephane Boyera, Mary Allen, and Hans Akkermans. Ra- geous Ideas at International Semantic Web Conference (ISWC diomarché: Distributed voice- andweb-interfaced market in- 2011). Jury award winning paper. 1st Place, 2011. formation systems under rural conditions. In Accepted for [3] UNCTD. Science and technology for development: the new publication in Proceedings of 24th International Conference paradigm of ict. Information Economy Report 2007-2008, on Advanced Information Systems Engineering, CAiSE’2012, United Nations Conference on Trade and Development, 2007. Gdansk, Poland, 25 U˝ 29 June 2012, 2012. [4] ITU. Telecommunications/ict markets and trends in africa. In- [17] Nana Baah Gyan, Victor de Boer, Anna Bon, Chris van Aart, ternational Telecommunications Union report, 2007. Stephane Boyera, and Hans Akkermans. Voice-based web ac- [5] The Economist. Mobile marvels. Special report on telecoms cess in rural africa. Proceedings of ACM Web Science 2013, in emerging markets. The Economist, September 2009, 2009. Paris, France, May 2-4, 2013., 2013. [6] John Domingue, Carlos Pedrinaci, Maria Maleshkova, Barry [18] Etienne Barnard, Marelie H. Davel, and Gerhard B. van Norton, and Reto Krummenacher. Fostering a relationship Huyssteen. Speech technology for information access: a south between linked data and the internet of services. In John african case study. In Proceedings of the AAAI Spring Sym- Domingue, Alex Galis, Anastasius Gavras, Theodore Za- posium on Artificial Intelligence for Development (AI-D), Palo hariadis, Dave Lambert, Frances Cleary, Petros Daras, Srd- Alto, California, March 2010, pages 8–13, 2010. jan Krco, Henning Müller, Man-Sze Li, Hans Schaffers, [19] Guus Schreiber, Alia Amin, Lora Aroyo, Mark van Assem, Volkmar Lotz, Federico Alvarez, Burkhard Stiller, Stamatis Victor de Boer, Lynda Hardman, Michiel Hildebrand, Borys Karnouskos, Susanna Avessta, and Michael Nilsson, editors, Omelayenko, Jacco van Osenbruggen, Anna Tordai, Jan Wiele- The Future Internet, volume 6656 of Lecture Notes in Com- maker, and Bob Wielinga. Multimedian e-culture demonstra- puter Science, pages 351–364. Springer Berlin / Heidelberg, tor. In The Semantic Web - ISWC 2006, Athens, Georgia, 2011. volume 4273 of LNCS, pages 951-958, Winner Semantic Web [7] Robert Jensen. The digital provide: Information (technology), Challenge 2006. Springer Verlag, November 2006, 2006. market, performance, and welfare in the south indian fish- [20] Henk Kroon. Ecomash: Linked market data for sa- eries sector. The Quarterly Journal of Economics, Issue 3, hel eco. M.Sc. Thesis VU University Amsterdam CXXII(879-924):879–924, August 2007. http://www.few.vu.nl/ vbr240/teaching/Henk Kroon - Master [8] Jenny Aker. Does digital divide or provide? the impact of cell thesis.pdf, 2012. phones on grain markets in niger. Center for Global Develop- [21] Martin Hepp. Goodrelations: An ontology for describing prod- ment Working Paper, (154), 2008. ucts and services offers on the web. In Knowledge Engineer- [9] Megumi Muto and Takashi Yamano. The impact of mobile ing: Practice and Patterns, pages 329–346. Springer, 2008. phone coverage expansion on market participation: Panel data [22] Caroline Pade, Brenda Mallinson, and David Sewry. An ex- evidence from uganda. World Development, 37(12):1887– ploration of the critical success factors for the sustainability of 1896, 2009. rural ict projects–the dwesa case study. In Information Systems [10] W3C. Voice extensible markup language (voicexml) Development, pages 339–352. Springer, 2009. version 2.0. W3C Recommendation 16 March 2004 [23] Hans Akkermans, Chris van Aart, Victor de Boer, Nana Baah http://www.w3.org/TR/voicexml20/, 2004. Gyan, Anna Bon, Wendelien Tuyp, and Amadou Tangara. m- [11] Sheetal Agarwal, Arun Kumar, Amit Anil Nanavati, and Niten- agro knowledge sharing field pilot U˝ final evaluation. VOICES dra Rajput. The world wide telecom web browser. In Proceed- project deliverable document D5.4 (unpublished), 2013. ings of the First ACM Symposium on Computing for Develop- [24] Victor de Boer, Michiel Hildebrand, Lora Aroyo, Pieter ment, ACM DEV ’10, pages 4:1–4:9, New York, NY, USA, De Leenheer, Chris Dijkshoorn, Binyam Tesfa, and Guus 2010. ACM. Schreiber. Nichesourcing: Harnessing the power of crowds [12] ESOKO. Esoko. http://www.esoko.com/, 2011. of experts. In Annette Teije, Johanna Volker, Siegfried [13] Google AppLab. Google sms to serve needs of poor in Handschuh, Heiner Stuckenschmidt, Mathieu dAcquin, Andriy uganda. http://blog.google.org/2009/06/google-sms-to-serve- Nikolov, Nathalie Aussenac-Gilles, and Nathalie Hernandez, needs-of-poor-in.html, 2009. editors, Knowledge Engineering and Knowledge Management, [14] World Wide Web Foundation. Open government data. volume 7603 of Lecture Notes in Computer Science, pages 16– http://www.webfoundation.org/projects/ogd/ retrieved 14-03- 20. Springer Berlin Heidelberg, 2012. 2012, 2012. [25] Rokhsareh Nakhaie. Designing the open voice platform: A [15] Christophe Guéret and Stefan Schlobach. Semanticxo : con- multi-lingual, crowd-sourced, and open platform to enable the necting the xo with the world’s largest information network. creation of voice based applications for use in internet-deprived In Proceedings of the First International Conference on eTech- environments. M.Sc. Thesis VU University Amsterdam nologies and Networks for Development, ICeND2011, Com- http://www.few.vu.nl/ vbr240/teaching/Rokhsareh_Nakhaie_ munications in Computer and Information Science, Springer Master_thesis.pdf, 2012. LNCS, 2011.