Open Data Kit: Tools to Build Information Services for Developing Regions
Carl Hartung, Computer Science and Engineering, University of Washington, Seattle, WA 98195
Yaw Anokwa, Computer Science and Engineering, University of Washington, Seattle, WA 98195
Waylon Brunette, Computer Science and Engineering, University of Washington, Seattle, WA 98195
Adam Lerer, Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139
Clint Tseng, Computer Science and Engineering, University of Washington, Seattle, WA 98195
Gaetano Borriello, Computer Science and Engineering, University of Washington, Seattle, WA 98195

ABSTRACT

This paper presents Open Data Kit (ODK), an extensible, open-source suite of tools designed to build information services for developing regions. ODK currently provides four tools to this end: Collect, Aggregate, Voice, and Build. Collect is a mobile platform that renders application logic and supports the manipulation of data. Aggregate provides a "click-to-deploy" server that supports data storage and transfer in the "cloud" or on local servers. Voice renders application logic using phone prompts that users respond to with keypad presses. Finally, Build is an application designer that generates the logic used by the tools. Designed to be used together or independently, ODK core tools build on existing open standards and are supported by an open-source community that has contributed additional tools. We describe four deployments that demonstrate how the decisions made in the system architecture of ODK enable services that can both push and pull information in developing regions.

Categories and Subject Descriptors

H4.3 [Information Systems Applications]: Communications Applications; H5.2 [Information Interfaces and Presentation (e.g., HCI)]: User Interfaces; C2.4 [Distributed Systems]: Client-server, distributed applications

General Terms

Design, Human Factors

Keywords

mobile computing, mobile phones, ICTD, client-server distributed systems

1. INTRODUCTION

Over the last fifty years, advances in information and communication technologies (ICTs) have transformed the way we create, retrieve, update, and delete information. Despite this revolution in information management, much of the world has not benefited from these technological advancements. To address many of these disparities, development agencies have pushed for evidence-based development, in which the best available data is used to inform development challenges.

In a sense, this approach is not new. From agricultural extension to immunization campaigns, services that push and pull information in developing regions have been at the heart of global development. However, even with many years of practice, providing these services is still a difficult task. Current practice, which is primarily paper-based, limits the scale and complexity of the services that can be provided, and thus the impact of the intervention.

The growth of mobile phone usage in these regions [38] has created opportunities to digitize and automate many of these services in a cost-effective manner. Of course, computing is no panacea, as noted by Toyama et al. [50] and Brewer et al. [30]. Challenges ranging from user limitations to infrastructure constraints have proven to be particularly pernicious. In the rare instances where the introduction of computing has been successful, implementation has often required a level of technical expertise not readily found in situ [39].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICTD2010 December 13-15, 2010, London, U.K. Copyright 2010 ACM 978-1-4503-0787-1/10/12 ...$10.00.

For computing to truly address the information gaps in developing regions, information services must be composed by non-programmers, deployed by resource-constrained organizations, used by minimally-trained users, and remain robust despite intermittent power and connectivity. To address these challenges, we developed Open Data Kit (ODK) [17], a modular, extensible, and open-source suite of tools designed to empower users to build information services for developing regions. ODK currently consists of four tools: Collect, Aggregate, Voice, and Build.

ODK Collect is a mobile platform that renders complex application logic and supports the manipulation of data types that include text, location, images, audio, video, and barcodes. ODK Aggregate provides a "click-to-deploy" server that supports data upload, storage, and transfer in the "cloud" as well as on local servers. ODK Voice renders application logic using automated phone prompts that users respond to with keypad presses. Finally, ODK Build is a drag-and-drop application designer that generates the logic used by the tools.

Designed to be used together or independently, ODK tools build on existing open standards and empower individuals and organizations to compose services that collect and distribute information in the developing world. ODK is supported by an open-source community that has contributed training documents, localization support, and additional tools.

Examples of how ODK can be used include:

• Government workers completing socio-economic surveys about households in a district.
• Agricultural extension workers creating an application with video and audio clips explaining farming techniques.
• Teachers implementing games with interactive questions and answer tutorials and automatic score recording.
• Crisis workers capturing images and locations of damaged areas after an earthquake.
• Funders receiving geo-tagged reports of interventions they have supported.
• Clinicians building decision support applications that use patient data to help determine when to administer tests.
• Microfinance institutions tracking transactions from lenders and borrowers.
• Indigenous tribes cataloging their trees to enable participation in global carbon markets.
• Community health workers managing household visits to pregnant women.

In this paper, we describe how ODK differs from previous work, detail the set of tools we currently provide, and evaluate four ongoing deployments. We also discuss how the design decisions made in the system architecture of ODK

these is CyberTracker [4], a system first developed in the mid-1990s as a way to enable non-literate animal trackers to record observations on PDAs (sometimes with attached GPS units) using a purely graphical and non-linear interface. Trackers, when observing a specific animal behavior, tap a representative icon on the screen to mark that behavior. For applications such as socio-economic surveys, CyberTracker replaced animal behavior icons with icons representing families, houses, and marriage status.

CyberTracker is still in wide use today and has added functionality including a form designer, data synchronization over the web, and image capture. While these upgrades build toward a more generic system, they do not change the fundamental use case and interactions. That is, CyberTracker is designed for gathering large quantities of geo-referenced data for illiterate field observers and synchronizing those observations to a local computer.

For broader use cases than CyberTracker targets, Pendragon Forms [20, 1] has been a popular and fully-featured commercial solution that includes a form designer, data synchronization, multimedia support, and forms with navigation logic. Although designed for developed regions, Pendragon Forms has also been used all over the world [29, 48].

Our work differs from Pendragon Forms along four dimensions that are critical for resource-constrained environments: cost of deployment, ease of extensibility, available devices, and data transport. For the functionality ODK provides for free, Pendragon Forms requires $80 per user. Additionally, because of our open-source license, organizations are free to modify and customize the applications as needed. Finally, ODK runs on a variety of phones, netbooks, and tablets, and supports multiple methods for transferring data to other services.

There are free and open-source competitors to Pendragon Forms that we also considered. Java Platform, Micro Edition (J2ME) phone-based data collection clients such as FrontlineForms [9], EpiSurveyor [8], CommCare [3], and JavaRosa [14] have become popular as the prices of Java-enabled phones have fallen. Unfortunately, these phones live in a fragmented ecosystem that negatively impacts software development and usability.

Applications must often be signed by the vendor, carrier, or manufacturer before interactions with storage, networking, or hardware accessories are usable. Without the appropriate digital certificates and signatures, users are prompted with confusing dialogs before every such action. The signing process can require months of waiting and thousands of dollars. Even after signing authority is obtained, capturing images, audio, video, and location remains difficult because each device implements the interface to its underlying hardware differently. J2ME programmers are forced to test every software release on each physical device they