Point-of-use Soil Diagnostics: An Actionable Information System for Resource Constrained Farmers

by

Soumya Braganza

B.E. Electronics and Communications Engineering Birla Institute of Technology, Mesra, 2010

SUBMITTED TO THE INSTITUTE FOR DATA, SYSTEMS, AND SOCIETY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE IN TECHNOLOGY AND POLICY AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY

JUNE 2016

©2016 Massachusetts Institute of Technology. All rights reserved.

Signature of Author: ______

Institute for Data, Systems, and Society May 12, 2016

Certified by: ______

Chintan Vaishnav Senior Lecturer, MIT Sloan School of Management Thesis Supervisor

Accepted by:______

Munther Dahleh William A. Coolidge Professor, Electrical Engineering and Computer Science Director, Institute for Data, Systems, and Society Acting Director, Technology and Policy Program


Point-of-use Soil Diagnostics: An Actionable Information System for Resource Constrained Farmers

by

Soumya Braganza

Submitted to the Institute for Data, Systems, and Society on May 12th 2016, in partial fulfillment of the requirements for the degree of Master of Science in Technology and Policy

Abstract

During the mid-1960s, India came to the brink of an acute food crisis in the midst of heavy dependence on food imports. A period of rapid agricultural modernization that followed, known as the Green Revolution, transformed India from a net importer of food into an exporter. Although an appropriate response for abating the impending starvation, the Green Revolution had several unintended consequences. For example, the regulatory structure and fertilizer subsidies for urea that were designed to stimulate growth instead resulted in a lock-in, which in turn incentivized vast over-fertilization across the country. Today, this is a well-recognized problem, and the Government of India has announced policies and schemes such as the National Soil Health Card Scheme to increase knowledge of soil condition and curb fertilizer use. In reality, however, the current need for information on soil health far exceeds the capacity for soil testing, highlighting the need for a radical approach to meeting this policy objective.

This project, undertaken in collaboration with MIT Mechanical Engineering, takes a two-part approach to addressing this problem, with the design of a point-of-use soil testing sensor and an accompanying recommendation generation engine. This thesis presents the design of the latter based upon the answer to the following question: what constitutes actionable information for resource constrained farmers? To answer it, we use a mixed-methods approach comprising (i) a combination of stakeholder interviews and design workshops to elicit user needs, and (ii) controlled experimentation with over 200 farmers covering an entire village to measure the actionability of information in soil health recommendations. The analysis of the experimental data reveals that the actionability of recommendations varies significantly within the population of farmers tested, and that this variation can be attributed to the level of information provided, the environment in which a farmer receives a recommendation, gender, and education level. Consequently, an effective point-of-use diagnostic system must adjust for these factors in order to maintain high actionability. To that end, we then use the experimental results to design a recommendation generation engine, the core of which is a soil health database that maximizes the actionability of information for a resource constrained farmer.

Thesis Supervisor: Chintan Vaishnav Title: Senior Lecturer


Acknowledgements

First and foremost, I want to thank my thesis supervisor, Chintan Vaishnav, for his support and guidance through this project. He consistently allowed this work to be my own, but steered me in the right direction whenever he thought I needed it. I admire Chintan for his dedication to this project and the passion with which he strives to bring the fruits of research at MIT to the underprivileged in India.

I am immensely grateful to the MIT Tata Center for Technology and Design for making this project possible, and for providing me with a wealth of resources, support, and guidance over the last two years. The opportunity to directly interact with the individuals whom my research is meant to serve is a truly unique experience that only the Tata Center could have provided. I am proud to call myself a Tata Fellow.

A very special thank you to Ron Rosenberg, my research buddy and travel companion. Field trips to India would not have been as much fun without you! Thank you also to Leah Slaten, the amazing UROP who contributed to this project.

This work would not have been possible without the support of The Deshpande Foundation, The Himmothan Society, IARI, and numerous others who supported us in our trips to India. A special thank you to Naveen, Manjunatha, Innus, and Kusuma at the Deshpande Foundation, and Malavika at the Himmothan Society, all of whom cheerfully devoted numerous hours towards making our field work a success.

To the TPP administration, thank you for always being supportive and accommodating throughout my time at MIT. I feel so lucky to have been a part of TPP, surrounded by some of the smartest and most interesting people I know.

Finally, I must express my very profound gratitude to my family, and especially to my husband Siddharth, for providing me with unfailing support and continuous encouragement throughout my time at MIT and through the process of researching and writing this thesis. This accomplishment would not have been possible without you.


Contents

1. Introduction
2. Methodology Overview & Research Question
  2.1 Research Question
  2.2 Methodology Overview
    2.2.1 Field Research
    2.2.2 Policy Analysis
    2.2.3 Actionability Experiment
    2.2.4 Database Design
3. Problem Finding
  3.1 Review of Literature
  3.2 Gap Analysis
    3.2.1 Policy Gap
    3.2.2 User Needs Gap
      3.2.2.1 Stakeholder Interviews
      3.2.2.2 Interactive Workshops
      3.2.2.3 Product Contract
  3.3 Discussion of Research Question
4. Actionability Experiment: Data and Analysis
  4.1 Motivation
  4.2 Experimental Design
  4.3 Examination of Data
  4.4 Components of Actionability
  4.5 Analysis of factors affecting components of Actionability
    4.5.1 Model 1: ‘Interpret’
    4.5.2 Model 2: ‘Ease’
    4.5.3 Model 3: ‘Acquire’
    4.5.4 Model 4: ‘Afford’
    4.5.5 Discussion
  4.6 Formulation of an Actionability Index
  4.7 Lessons for Database Design
5. Soil Database and Recommendation Engine
  5.1 Functional Overview
  5.2 Database Schema
  5.3 Recommendation Generation
  5.4 Future Improvements
6. Conclusion and Future Work
Bibliography
Appendices
  Appendix A: Sample recommendation reports from experiment
  Appendix B: Sample survey sheet for data collection
  Appendix C: Kernel regressions of actionability components to assess relationships with key explanatory variables
  Appendix D: Statistical analyses of Actionability indices
  Appendix E: Sample recommendations generated by recommendation engine


List of Figures

Figure 1.1 Agricultural sector contribution to GDP is falling sharply
Figure 1.2 India’s average crop yields of major crops are far below global averages
Figure 3.1 Design flow depicting translation of user inputs into design requirements
Figure 3.2 (Left) Interview with farmers in Hubli region, Karnataka, (Right) Interview with farmers in Dehradun region, Uttarakhand
Figure 3.3 (Left) Group interview with women in Hubli region, Karnataka, (Right) Interview with women at their home in Dehradun region, Uttarakhand
Figure 3.4 Sugar processing plant at Godavari Biorefineries Ltd., Karnataka
Figure 3.5 Product attribute rankings from Workshop Round 1
Figure 3.6 Product attribute list in Kannada and English
Figure 3.7 (Left) Storyboards depicting soil collection methods, (Right) Farmers participating in role play exercise with outdoor props
Figure 3.8 Female participants working together to decode colorimetric sensor prototypes
Figure 3.9 (Right top) Paper mockups of colorimetric soil testing sensors, (Right bottom) Soil card to interpret color-based sensor reading, (Left) Soil card to interpret distance-based sensor reading
Figure 3.10 Demonstration of sensor functionality using works-like prototypes
Figure 3.11 Aggregate rankings of Input (left) and Output (right) mechanisms of interaction with soil health recommendation database
Figure 3.12 Farmers in Hubli, Karnataka reading mock recommendation cards
Figure 4.1 Graphical summary of demographics of Ballarawad
Figure 4.2 Comparison of Actionability indices 2 and 3 partitioned by Environment
Figure 4.3 Comparison of Actionability indices 2 and 3 partitioned by Test Group
Figure 5.1 Schema of Soil Health Database
Figure 5.2 Variation of mean Actionability scores across Test Groups, grouped by Farmer Typology
Figure 5.3 Decision tree algorithm for generation of customized recommendations
Figure A.1 Sample recommendation report provided to Control group
Figure A.2 Sample recommendation report provided to Treatment 1 group
Figure A.3 Sample recommendation report provided to Treatment 2 group
Figure B.1 Sample survey sheet for data collection in actionability experiment
Figure C.1 Kernel regression outputs of ‘interpret’ on key explanatory variables
Figure C.2 Kernel regression outputs of ‘ease’ on key explanatory variables
Figure C.3 Kernel regression outputs of ‘acquire’ on key explanatory variables
Figure C.4 Kernel regression outputs of ‘afford’ on key explanatory variables
Figure D.1 Variation of actionability indices with education level
Figure D.2 Variation of actionability indices with interpret score
Figure E.1 Sample recommendation 1
Figure E.2 Sample recommendation 2
Figure E.3 Sample recommendation 3


List of Tables

Table 3.1 Product contract describing attributes that determine technical and system design constraints
Table 4.1 Indian Census data for candidate villages in Karnataka state
Table 4.2 Typology counts per Test Group and Environment
Table 4.3 Summary statistics and variable definitions
Table 4.4 Statistical comparison of demographics across Test Groups and Environments
Table 4.5 Statistical comparison of output variables across Test Groups and Environments
Table 4.6 OLS Regression analysis of ‘interpret’
Table 4.7 OLS Regression analysis of ‘ease’
Table 4.8 OLS Regression analysis of ‘acquire’
Table 4.9 OLS Regression analysis of ‘afford’
Table D.1 Variation of actionability indices across Test Groups and Environments
Table D.2 Variation of actionability indices across gender and typology


Chapter 1

Introduction

Agriculture and allied sectors are the backbone of the Indian economy, accounting for 14% of the country’s GDP (MOA, 2013), and approximately 50% of the total employment in the country (MOF, 2015). Yet, while the Indian economy has experienced impressive GDP growth, with an overall growth rate of 6.2% in 2011-2012, the growth in agriculture and allied sectors was a low 3.6% in the same period (MOA, 2013). Further, this growth rate fell by a huge margin in 2013 to just 1.8% (MOA, 2013). This decelerating trend in agricultural growth and contribution to GDP is a worrying phenomenon for a country that represents roughly 2.4% of the world’s area, but supports 17% of its population (MOA, 2013).

Figure 1.1 Agricultural sector contribution to GDP is falling sharply (Source: Hunter 2014 from “GDP at Factor Cost at 2004-05 Prices, Share to Total GDP” Databook, Planning Commission, Govt of India)

As a result, the situation of rural Indian farmers continues to decline. The average landholding size has decreased to 1.15 hectares (Agriculture Census 2010-11) due to high birth rates and generational land redistribution, making it less feasible for farmers to leverage economies of scale and adequately invest in their farms. Smallholder farmers, whose land holdings are less than 2 hectares, constitute approximately 80% of all Indian farmers (DOA, 2014 and Singh, 2014). Rural wage growth has also been particularly weak over the last three fiscal years, reaching a low of 3.8% in 2015, lower than the 4.1% increase in the consumer price index that year (Haq, 2015 and Damodaran, 2015), indicating a contraction in purchasing power. The extent of the struggle of the rural Indian farmer can be seen not only in the resultant mass migration from rural to urban population centers (Abbas, 2014), but also in the recent surge in farmer suicides, with 12,360 farmer suicides registered in 2014 (Rukmini, 2015).

Looking to the future, the picture only becomes more worrisome. With a burgeoning population expected to increase to 1.75 billion people by 2050 – a 44% increase from India’s current population – and only 12.9% of arable land left to be cultivated (World Bank, 2016), if current agricultural productivity remains the same, India will not even be able to sustain its own population. When taking into account a projected increase in per capita food consumption, as per capita GDP continues to rise, India will likely face this crisis sooner rather than later. Together, these findings highlight the pressing need to improve agricultural productivity so as to increase the quality of life of rural small holding farmers, strengthen the agricultural sector of the Indian economy, and ensure India’s food security for the future.

The State of Soil Health in India

The Green Revolution of the 1960s and 1970s in India led to vast improvements in income and crop yields due to the introduction of modern methods in agriculture, allowing India to go from being a food importer to a food exporter (IBEF, 2013). However, crop yields in India today remain far below global averages, as depicted in Figure 1.2. Of India’s major crops, only wheat, which is 93 percent irrigated, has average yields close to the world average (USDA, 2015).

Figure 1.2 India’s average crop yields of major crops are far below global averages

With the exception of milk production, Indian agricultural yields remain low compared to world averages, despite growing government subsidies for fertilizer and electricity (USDA, 2015), which underscores the vast potential for increased productivity in the agricultural sector. The primary contributors to poor efficiency in India's agriculture sector can be categorized as follows: (1) low mechanization, (2) poor disease and pest management, (3) climate change and drought, (4) storage and transportation losses, (5) insufficient irrigation, and (6) poor soil health. This thesis focuses on the last of these, poor soil health.

Soil health can be quantified by the relative macronutrient and micronutrient concentrations in soil. Macronutrients refer to the nitrate, phosphate, and potassium (i.e. N, P, K) ion concentrations in soil. The pH of the soil is another key indicator of soil health, often measured along with macronutrient concentrations. Micronutrients refer to nutrients at lower concentrations, specifically calcium, boron, zinc, copper, iron and more. The current state of Indian soil health is poor. Skewed values of macro and micronutrients are ubiquitous, water quality is declining, and organic carbon content has also diminished (Patel, 2015), leading to an overall decline in soil health. Government interventions to improve food production, such as fertilizer subsidies and financing schemes, have had the unintended consequence of incentivizing the imbalanced use of fertilizers. The result is that while the ideal NPK fertilizer consumption ratio is 4:2:1, the actual NPK use ratio varies widely across India. Specifically, Indian farmers tend to overuse urea as it is highly subsidized compared to other macronutrient fertilizers. On average, farmers apply twice the recommended amount of urea, and in some northern states, farmers apply closer to 10-15 times more than the requirement (Patel, 2015). From a farmer’s perspective, the problem is twofold: overuse is not only an unnecessary increase in capital expenditure, but also a detriment to soil health. From a macroscopic perspective, overuse leads to poor crop yields for India as a whole, in addition to the contamination of downstream groundwater sources, which can cause deleterious public health effects resulting from water eutrophication (Good, 2011).
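As a rough illustration of how an applied fertilizer mix can be compared against the 4:2:1 ideal quoted above, the following sketch normalizes applied N, P and K quantities and reports how far nitrogen exceeds its ideal share. The function names and the example quantities are hypothetical and chosen only to mirror the "twice the recommended amount" case; they are not data from this thesis.

```python
# Ideal N:P:K consumption ratio cited in the text.
IDEAL_NPK = (4.0, 2.0, 1.0)


def npk_ratio(n_kg, p_kg, k_kg):
    """Normalize applied N, P, K quantities so that K = 1."""
    if k_kg <= 0:
        raise ValueError("potassium quantity must be positive to normalize")
    return (n_kg / k_kg, p_kg / k_kg, 1.0)


def nitrogen_excess_factor(ratio, ideal=IDEAL_NPK):
    """Return applied N (relative to K) as a multiple of the ideal N share."""
    return ratio[0] / ideal[0]


# Hypothetical application amounts, used only for illustration.
applied = npk_ratio(n_kg=80, p_kg=20, k_kg=10)
print(applied)                          # (8.0, 2.0, 1.0)
print(nitrogen_excess_factor(applied))  # 2.0
```

A factor of 2.0 corresponds to the average over-application described above; the 10-15 times figure reported for some northern states would show up as a factor in that range.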

The Need for Information on Soil Health

Providing accurate information about a farmer’s soil chemistry would go a long way in curtailing the negative effects of improper fertilizer use. We posit that if more farmers were able to test their soil effectively and receive actionable advisory, they would be better able to make informed fertilization and irrigation decisions, thereby increasing their productivity and profitability.

For comparing different available soil testing methodologies, we provide a framework for assessing an ideal soil testing solution. We define that such a solution must fulfill the following attributes:

1. Availability - the solution must be available to the farmer, ideally at the point-of-use, without the need for excessive travel
2. Affordability - the solution must be easily affordable for an individual farmer or a small group of farmers
3. Usability - the solution must be easily used and intuited without prior knowledge of soil chemistry fundamentals
4. Actionability - the solution must provide actionable feedback to the user in terms of quantifiable steps towards soil health improvement

Soil testing in India currently takes place either in government labs, mobile soil-testing units, or by the use of field kits (DOA, 2011). Government sanctioned farmer extension service centers, called Krishi Vigyan Kendras (KVKs) are mandated to have an active soil testing facility on site, where farmers can drop off soil samples and receive a report on their soil chemistry. Most often these soil tests are available at a subsidized cost (typically ₹50-400 or $1-$6 USD), based on data from surveys. The problem with the KVK system however, is four-fold:

• Farmers have to travel long distances to reach a soil lab, thereby sacrificing a day’s worth of labor and/or wages.
• The time delay between sample collection and receipt of the soil report is often weeks or months, so the report arrives after the planting season and is rendered useless.
• Widespread inefficiency and mismanagement often lead to missing or incomplete soil reports.
• There are 514 government soil testing laboratories in India with a capacity of about 6.5 million samples per annum (Dept. of Farmer Welfare), which falls far short of the need. At only 1 KVK per 216,000 operational holdings, the shortage of staff to accommodate demand and maintain an efficient soil testing facility is an enormous constraint.

Thus, while KVKs are affordable to rural smallholder farmers, they fall short on availability and actionability.

Mobile units and field kits are available to farmers in India for approximately ₹35,000 (equivalent to $500 USD), and include a set of extraction solutions, beakers, and colorimetric reactive dyes. With respect to cost, such a solution is far beyond the purchasing power of rural small holdings farmers, who make on average ₹270 per day (Haq, 2015). Instead, these mobile soil testing kits are often used by those with greater individual or aggregate purchasing power, such as field agents hired by large scale contractors, farmer cooperatives, academic institutions, and agricultural NGOs. Even if a farmer could afford such a solution, these kits involve multi-step chemical extraction procedures and assays, which often require prior chemistry intuition to successfully receive an accurate result. As such, while the mobile soil testing kits are more readily available to farmers, they fail in their affordability and usability.
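To make the affordability gap concrete, the short calculation below converts the prices quoted above into days of average wages. It is a back-of-the-envelope sketch: the helper name is hypothetical, and it assumes (unrealistically) that a farmer's entire daily earnings could go towards the purchase.

```python
# Figures quoted in the text.
KIT_COST_INR = 35_000          # approximate price of a mobile unit or field kit
KVK_TEST_COST_INR = (50, 400)  # typical subsidized range for a KVK soil test
DAILY_WAGE_INR = 270           # average daily earnings (Haq, 2015)


def days_of_wages(cost_inr, daily_wage_inr):
    """Days of full wages needed to cover a cost (assumes no other spending)."""
    return cost_inr / daily_wage_inr


kit_days = days_of_wages(KIT_COST_INR, DAILY_WAGE_INR)
kvk_days = [days_of_wages(c, DAILY_WAGE_INR) for c in KVK_TEST_COST_INR]

print(f"Field kit: about {kit_days:.0f} days of wages")
print(f"KVK test: about {kvk_days[0]:.1f} to {kvk_days[1]:.1f} days of wages")
```

Under these assumptions a field kit costs on the order of 130 days of wages, against at most a day and a half for a subsidized KVK test, which is why the kits are classed here as failing on affordability.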

Understanding soil chemistry, and therefore treating the soil with the correct type and amount of fertilizers, is a key step towards improving crop yields while protecting soil health. This research aims to address this challenge of poor soil health and information by creating a novel low-cost soil-testing sensor to measure soil nutrient concentrations¹, which works in conjunction with a robust information feedback system to provide farmers with timely actionable information about their soil. The ultimate goal is to empower the farmer with information about his soil and crops, so that he can make informed fertilization and irrigation decisions to improve crop yields.

Thesis Overview

The remainder of this thesis is laid out as follows. In chapter 2, the research question guiding this thesis is presented, along with an overview of the mixed methodology approach used to answer the question. Chapter 3 provides a detailed description of the various techniques employed to understand the problem area in greater depth, which provides direction for the subsequent analysis performed. Chapter 4 describes a field experiment conducted in a village in India in January 2016, the analysis of resulting data, and the formulation of a numerical index measuring the actionability of simulated soil health recommendations. In chapter 5, lessons from the analysis in the preceding chapter are used to inform the design of a soil health recommendation database for the creation of user-centric, customized soil health recommendations. Concluding remarks and potential future work are discussed in chapter 6.

¹ Soil sensing technology is being developed by Ron Rosenberg at the Mechanical Engineering department at MIT, co-researcher and Tata Fellow on this project, and co-author of this introductory chapter.


Chapter 2

Methodology Overview & Research Question

To frame the mixed methodology and analysis described in Chapters 3, 4, and 5, this chapter gives a brief overview of the methods used. Section 2.1 covers the research question that guides the work described in this thesis. Section 2.2 then provides an overview of the various methods used to uncover system design inputs from a range of stakeholders and sources.

2.1 Research Question

The following question guides the development of a soil health recommendation system for small holder rural Indian farmers developed later in this thesis.

“What constitutes an actionable information system for resource-poor Indian farmers?”

Actionability is defined as that property that enables the farmer to interpret and act upon soil health recommendations 100% of the time. As a system property, actionability is distinct from and goes beyond usability. In his pivotal book Usability Engineering, Jakob Nielsen defines usability as “a quality attribute that assesses how easy user interfaces are to use”. Nielsen further expands this definition to include the following five quality components that determine the level of usability of a system or interface: (i) learnability: how easy it is for users to accomplish a task on first encountering the system, (ii) efficiency: how quickly users can perform tasks, (iii) memorability: the ability of users to reestablish proficiency with a system, (iv) errors: the rate and severity of errors that users make, and (v) satisfaction: the level of satisfaction of users on interacting with the system. Success in addressing the problem of poor soil health and information in the context of rural India will require a system design for recommendations that optimizes all of the above criteria. In other words, a recommendation system that optimizes all the quality components listed will enable maximal interpretation of recommendations by farmers. However, in order to be useful to the target demographic, smallholder farmers, the system will have to go beyond usability alone, and accommodate considerations for stimulating user action beyond the direct interaction of the user with the recommendation system. Actionability therefore integrates considerations of usability with a system design that is intended to stimulate user action.

2.2 Methodology Overview

This section describes the methodological approaches used to answer the research question stated above.

2.2.1 Field Research

The target beneficiary of the soil testing solution under development is the smallholder Indian farmer. In order to sustain a user-centric design process, numerous exchanges occurred with smallholder farmers as well as other stakeholders in the soil health value chain during all phases of the solution design. These interactions took place in various locations in India, and in a variety of formats such as interactive workshops, focus groups, and interviews. Field research methods are described fully in Chapter 3.

2.2.2 Policy Analysis

The agriculture and allied sectors in India are a substantial employer, accounting for about 50% of the workforce in the country. However, the share of the agriculture and allied sectors in GDP has been consistently declining, falling from 23.2% in the year 2000 to 13.9% in the year 2014 (MOF, 2015). The Ministry of Finance identified low agricultural productivity as one of the primary reasons for the inability to sustain high economic growth in the country (Economic Survey 2013-14). It has also generally been observed that the shares of GDP and labor in agriculture decline as countries develop. In terms of percentage share of employment in agriculture in the 2000s, India is surpassed by only a handful of poorer African countries, namely Ethiopia, Mozambique, and Burkina Faso (Roser, 2015), an indication that the potential for further development in the agriculture sector in India is enormous. This called for an examination of the effects of various national and state level agricultural policies on farmers, and an assessment of the successes and failures of these programs as they relate to soil health and productivity of farms in India. An analysis of Indian agricultural policy to gain a deeper understanding of its impact on farmers is laid out in Chapter 3.

2.2.3 Actionability Experiment

An understanding of user needs and policy gaps led to a preliminary set of hypotheses about the information content and design that allow for maximal actionability for farmers. An experiment was run with farmers in a village in South India, the purpose of which was to try and elucidate the various factors that affect actionability as it pertains to soil health advisory in a resource-poor setting, via direct interaction with the user. A description of this experiment as well as an analysis of the resulting data is laid out in Chapter 4.

2.2.4 Database Design

Results from the actionability experiment directly feed in as design inputs to a recommendation generation engine, the heart of which is a database containing information on soil health (macronutrient) indicators accompanied by recommended corrective measures for different levels of nutrients and crops. The raw information contained in the database is a collation of data from the variety of rich resources of the Government of India – State agricultural universities, agricultural extension services, and independent research institutions. Chapter 5 describes the design of the schema of the database core, and the recommendation generation algorithm used to tailor information to user specific needs.
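The core idea of such a database, nutrient levels mapped to corrective measures per crop, can be sketched as a simple keyed lookup. This is only an illustrative sketch and not the schema presented in Chapter 5: the key layout, the function name, and every entry are hypothetical placeholders rather than agronomic advice.

```python
from typing import Dict, List, Tuple

# (crop, nutrient, level) -> corrective measure.
# Placeholder entries only; not taken from the thesis database.
Key = Tuple[str, str, str]

RECOMMENDATIONS: Dict[Key, str] = {
    ("wheat", "N", "low"): "placeholder: increase nitrogen input",
    ("wheat", "N", "high"): "placeholder: reduce nitrogen input",
    ("wheat", "P", "low"): "placeholder: increase phosphate input",
}


def recommend(crop: str, readings: Dict[str, str]) -> List[str]:
    """Return the corrective measures on file for each nutrient reading.

    Readings with no matching entry (for example, a nutrient already at
    an acceptable level) are skipped.
    """
    results = []
    for nutrient, level in sorted(readings.items()):
        measure = RECOMMENDATIONS.get((crop, nutrient, level))
        if measure is not None:
            results.append(f"{nutrient} ({level}): {measure}")
    return results


print(recommend("wheat", {"N": "high", "P": "low", "K": "medium"}))
```

A flat lookup like this covers only the raw nutrient-to-measure mapping; tailoring the output to a specific user, as the recommendation generation algorithm is meant to do, would sit on top of it.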


Chapter 3

Problem Finding

This chapter describes the various methods followed in order to understand the nature of the problem area, perform an assessment of user needs, and explore various stakeholder perspectives on the problem of poor soil health and information in India. The first section examines and synthesizes the contemporary literature that focuses on soil health and sustainability, agricultural productivity and rural development, the use of technology in agriculture, and the impact of advisory on consumer behavior. The main objective of this literature review is to garner a precise understanding of what has already been investigated, and to understand what implications the findings of those investigations will have with regard to the value-enhancing capabilities of a soil testing service in rural India. The next section contains a detailed analysis of the various “gaps” or failures of the soil health and fertilization system in India, from a number of perspectives. First, an examination of effects of various national and state level agricultural policies on farmers is done, followed by an assessment of the successes and failures of these programs as they relate to soil health and fertility of farms in India. This is followed by a section on user needs, co-authored with Ron Rosenberg, Tata Fellow in Mechanical Engineering, and fellow researcher on this project. The user needs gap section is a detailed synthesis of the methodology we used to engage various stakeholders in the soil health value chain in India, as well as design considerations for the system that result from these interactions. A distillation of findings into a product contract follows. The chapter concludes with a synthesis of these various analyses that develops into the specific research question targeted in this thesis.

3.1 Review of Literature

Technology in agriculture The system design complementary to a new technological solution is critical to the successful adoption and implementation of the technology. Multiple studies have been conducted on the use of information technology to increase economic development in resource constrained settings. A number of different approaches have been tried, with different end goals in mind. In a 2004 study based in North India, researchers carried out a comparative assessment of two initiatives carried out by two different organizations – TARAhaat and Drishtee.com (Kaushik and Singh 2004). TARAhaat’s social mission was to create sustainable rural livelihoods through IT- enabled education, while Drishtee’s aim was to enable internet based government services (Kaushik and Singh 2004). Both organizations initiatives were compared on the basis of several indicators including their ability to scale, penetration of selected social segments, revenues per month, government involvement, and marketing of their initiatives (Kaushik and Singh 2004). One of the preliminary findings of the comparative study was that the availability of communications infrastructure and financing acted as constraints in both cases (Kaushik and Singh 2004). Any technology based innovation will be heavily dependent on a robust communications infrastructure, which needs to be carefully considered prior to implementation. Another important finding from this study is “the importance of private institutional innovation, not just government policy, as a necessary vehicle for the diffusion of technological innovation” (Kaushik and Singh 2004). In terms of implications of these findings on this research, this

14 highlights the importance of strong partner organizations for technology diffusion on the ground in India, as well as encourages minimal reliance on government organizations as far as possible. A study carried out in Madhya Pradesh in central India examined the financial sustainability of a large network of village internet centers (Kumar 2004). A crucial constraint identified in this study corroborates the findings of (Kaushik and Singh 2004), that is that maintaining connectivity on the available communications infrastructure in India was a major challenge. The connectivity issue notwithstanding, the author believed that “[information technology] will play a vital role in disseminating agricultural best practice information and connecting farmers to agricultural scientists for consultancy” (Kumar 2004). A large dependent variable for this research is the robustness of the communications network (a technological network or not) used to reach out to farmers in a two-way manner, which might prove to be a considerable challenge. A significant benefit that the proposed solution would bring to larger corporate or government entities with an interest in the changes in soil health over time is the generativity of the system, that is, how the system can produce value over time by improving information about soil health in a region. A 2007 study on the role of ICT in agricultural development identified knowledge management as a critical piece that promotes development from the dual perspectives of “raising rural incomes and ensuring the sustainability of natural resource base of agricultural production for the growing populations and incomes” (Rao 2007). Numerous studies (substantiated by the author’s firsthand experience working with farmers in India) indicate that there is a large volume of traditional indigenous agricultural knowledge present in rural farming communities. 
The use of ICT is one potential method for the preservation of this type of valuable knowledge (Puri 2007). Another observation from Rao’s 2007 study is “[the need for the] development of institutional environments for the creation and delivery of information and knowledge to the end users” (Rao 2007), which is supported by the previously discussed findings of Kaushik and Singh.

Soil health

There are numerous indicators of soil health, including but not limited to concentrations of macro and micro nutrients, physical parameters, and biotic components of soil. Pertaining to this, Doran and Zeiss conducted a study of soil quality and soil health indicators, within the larger goal of evaluating sustainability of agricultural practices. According to their study, “any indicator of soil health or soil quality should meet the following five criteria: sensitivity to variations in management, well correlated with beneficial soil functions, useful for elucidating ecosystem processes, comprehensible and useful to land managers, easy and inexpensive to measure [listing paraphrased]” (Doran and Zeiss 2000). Our focus on the measurement of macro nutrients in this research project is based on their established correlation with beneficial soil functions, that is, the established correlation between macro nutrient concentrations and plant growth. This satisfies the second criterion outlined by Doran and Zeiss. Of interest from the system design perspective is the fourth criterion: “comprehensible and useful to land managers”. In their paper, the authors refer to biological indicators of soil health and note that “considerable thought and creativity are required to develop measurements of soil organisms that are comprehensible and useful to land managers” (Doran and Zeiss 2000). This observation holds true when extrapolating to other indicators of soil health, including macro nutrient concentrations. 
It is therefore of consequence to focus specifically on system parameters that allow land managers and/or farmers to easily and comprehensibly interpret soil test results that indicate the health of their soil.

In another study on soil health and sustainability, J. W. Doran explores the translation of scientific data on soil into actionable tools for practical use. He makes an interesting observation with regard to the modernization of agriculture, and the toll this has taken on the land: “Modern agriculture has developed into a high technology and high inputs industry that has met the increasing needs of an ever-growing human population. However, this ‘industrial’ system of agriculture increasingly results in reduced net economic returns to farmers, taxes the resilience of soil, stresses our natural non-renewable resources, and increases the potential for environmental pollution” (Doran 2002). In identifying indicators of soil health that are relevant to modern times, it is essential to take into consideration the factors that characterize the “modernity” of soil health: skewed nutrient ratios, contemporary pesticides, and chemical contaminants. In his paper, Doran suggests that in order to remain relevant, “indicators of soil health […] must be linked to the development of management systems that foster reduction in the inputs of non-renewable resources, maintain acceptable levels of productivity, and minimize impact on the environment” (Doran 2002).

Impact to agricultural productivity and rural development

This section examines and synthesizes learnings from studies in agricultural rural development, particularly those that focus on the characterization of sustainability and development as system properties in agriculture that guide and shape its progress as an industry. A study by Hansen on the concept of sustainability as a basis for guiding change in agriculture asserts that identifying system indicators that are consistent, relevant, and focused on aspects of system performance that influence sustainability is a challenge (Hansen, 1995). 
In simpler terms, Hansen implies that using sustainability as a guiding force for improvements in agriculture may prove inadequate, by virtue of the challenge in identifying system properties that adequately represent measures of sustainability in agricultural systems. In order to be able to measure, and thereby guide improvements in agricultural development, there is a need for indicators that adequately capture the correct system parameters, while simultaneously lending themselves to consistent and focused measurability.

The benefits of the Green Revolution in terms of agricultural development in Punjab are well known, and well understood. In a 2014 paper, Kamaljit Kaur Sangha explores the adverse impacts of the modern agricultural systems that were introduced in Punjab as part of the Green Revolution, specifically the overuse and misuse of land and water resources and loss of biodiversity (Sangha 2014). Sangha cites other studies on the sustainability of global agriculture: “About half of the global usable land is already under pastoral or intensive agriculture and many of the natural resources such as soil and water have been exploited to their maximum potential (Tilman et al., 2002)” (Sangha 2014). Currently, Punjab has more than 80% of its land under cultivation, which indicates a limited scope to grow or even sustain the current level of agricultural productivity. The paper focuses on a range of specific issues that constitute the negative consequences of the Green Revolution in Punjab. They include the loss of crop and wild plant diversity, socio-economic and health impacts, and pollution and depletion of water and soil resources, among others. In order to tackle these issues, Sangha suggests the implementation of a sustainable system of “Scientific and Traditional knowledge (ST) […] that can enhance resource use efficiency and crop production for sustainable farming systems” (Sangha 2014). 
Sangha’s primary thesis is to draw focus away from increased crop yields, and instead focus attention on the broader picture of sustainability of current agricultural practices. For this research, this introduces another perspective within the ultimate goal of improving farmer welfare, that is, the overarching goal of sustainability of agricultural practices as they relate back to farmer welfare in the long run. While this is not necessarily the primary objective of this work, it plays an important role in determining the type of actionable advice given to farmers as part of the program on improving soil health.

The Impact of Advisory on Consumer Behavior

In a 2011 study reviewing the impact of drinking water contamination related advisory on consumer behavior, Lucas et al. indicate that intervention with information about water quality does seem to promote behavior change in rural communities; however, they note that their ability to draw robust conclusions is limited by the evidence collected in the field (Lucas et al. 2011). A 2012 J-PAL paper that draws on evidence from an experiment in rural Andhra Pradesh in which the effect of water quality testing was measured supports this observation, noting that interventions in this space “should not overlook the impact that personally tailored information can have” (Hamoudi et al. 2012). Lucas et al. highlight four key considerations for the elucidation of relevant behavioral effects: 1) The need for evidence of impact using robust methods (e.g., random allocation of study participants, use of non-intervention control groups), 2) The format of information provided, 3) The methods of information dissemination, and 4) The use of community level interventions and outcomes. A notable lack of literature in this area indicates that the strength of evidence to support the value of dissemination of health-related advisory is low. 
These observations can be easily extrapolated to soil health related advisory, for which evidence on user impact is similarly scarce, indicating a clear need for rigorous experimental studies to determine the nature and extent to which advisory (specifically soil health related advisory) affects user behavior, particularly when these advisories are custom tailored to the user group in question.

3.2 Gap Analysis

This section contains an assessment of the various gaps in policies supporting soil health, and user needs for a solution in this area, as investigated using a mixed methodology approach. Sub-section 3.2.1 examines the state of federal and state level policy in India, for the purpose of assessing the successes and failures of these programs as they relate to soil health and fertility of farms in India, and subsequently laying out some policy suggestions to address concerns on poor soil health, and shift focus to more practical and sustainable means of soil health improvement. Sub-section 3.2.2 is co-authored with Ron Rosenberg, Tata Fellow in the Mechanical Engineering Department at MIT, and fellow researcher on this project. It describes the range of field research methodologies followed to uncover user needs and design considerations from a range of stakeholders in the soil health and testing value chain in India. Part of the work described in this sub-section was conducted in collaboration with Jasmine Florentine, a Tata Fellow in the Mechanical Engineering Department at MIT. The assessment of user and policy gaps detailed in this section allows for a deeper understanding of failures in the numerous systems that interact, directly or indirectly, with the small holder farmer to address his need to improve the health of his soil. The section concludes with a distillation of lessons from the various problem finding methodologies into a product contract, which translates user needs into specific product attributes incorporated into the design of the device and system.

3.2.1 Policy Gap

Introduction to Policy Framework

In the year 2000, the Indian government promulgated the National Agricultural Policy (NAP), which outlined a comprehensive set of objectives for the important sub-sectors of agriculture. The policy aimed to enable a growth rate of 4% per annum in the agriculture sector over the subsequent two decades, through the efficient use of natural resources, strengthening of rural infrastructure, promotion of value addition, growth of rural employment, and a number of other measures (National policy for farmers, 2007). The NAP detailed a set of extremely ambitious goals, and was critiqued for lacking an elaborate strategy by which those goals could be achieved (Chand 2004). Unsurprisingly, a large proportion of stated goals remain unfulfilled years later, as is characteristic of several policy initiatives in agriculture in India.

Acknowledging the gravity of the problem of poor soil health, the Government of India recently launched a program to issue soil health cards to 140 million farmers across India over a period of three years (PM India, 2015). Within half a year of the launch of the Soil Health Card program, it became clear that the original goal was unachievable, with only 40% of the first target of 8.4 million cards achieved (Parsai, 2015). The incapacity to distribute generalized soil health cards to farmers, let alone customized recommendations based on individual soil tests, is indicative of the extent of the problem, and the need for creative policy solutions to address the current issue of poor soil health.

This analysis examines the effects of various national and state level agricultural policies on farmers, and assesses the successes and failures of these programs as they relate to soil health and fertility of farms in India. Subsequently, some policy suggestions are laid out to address concerns on poor soil health, and shift focus to more practical and sustainable means of soil health improvement.

A History of Agricultural Policy

The period from 1950 to 1965, also known as the Pre-Green Revolution Period in India, saw the introduction of a number of policy measures aimed at enhancing food production and improving food security (Arora 2013). The major piece of legislation introduced at this time was the Zamindari Abolition Act, implemented at the state level, and aimed at eliminating land intermediaries, ensuring ownership rights to farmers and tillers of land, and ensuring a permanent improvement in the quality of the landholding (Arora 2013). The abolition of the oppressive Zamindari system led to 20 million statutory tenants acquiring occupancy rights, leading to a considerable increase in the area under the owner-operated system (Rao 1996). Further, state level laws were enacted to improve land ownership policies to ensure greater equity, which included laws on the control of unused land, distribution of unused land to the underprivileged, and mechanisms to ensure the retention of land holdings to recipients. A ceiling was placed on the size of land holdings, and consolidation of fragmented land was encouraged in order to better leverage mechanization for land improvement (Arora 2013).

The successes of these ambitious reforms were significant and measurable. Over two million agricultural cooperatives came into existence, and the credit provided by them increased from 8 per cent of total borrowings of cultivators in the early 1950s to 30 per cent by the mid-1960s. Food grain production increased, and consequently the prices of cereal declined sharply during the 1950s. Further, the gross irrigated area increased from 22.6 million hectares in 1950-51 to 32.7 million hectares in 1966-67 (Rao 1996).

A notable failure of political reforms in the pre-Green Revolution era was the lack of policy drivers for technological growth. Fertilizer consumption per hectare, which is a suggestive index of technological change, was only 7 kg per hectare in India in 1966-67 (Rao 1996), as compared to ~43 kg per hectare during the same time period in the United States (Roser, 2015). It would be reasonable to assume that slow behavioral changes coupled with inadequate new technologies on the market were responsible for this weak growth; however, there are indicators that larger policy goals played an important role. In 1954, the US Agricultural Trade Development and Assistance Act was signed into United States federal law, with a purpose “[t]o increase the consumption of United States agricultural commodities in foreign countries, [and] to improve the foreign relations of the United States” (Pub.L. 83–480, enacted July 10, 1954). This Act, also known as PL480, made abundant food grains available on concessional terms in the Indian market. Rather than incentivize farmers in the country to improve production rates, policy makers in India chose to leverage the new opportunity to import food under PL480 to satisfy the policy goals of improved food security.

The second phase in agricultural policy came about during a severe food crisis, and the recognition that continued reliance on food imports and aid imposed a heavy cost in terms of political pressures and economic instability in the country (Rao 1996). As a result, in the mid-1960s the focus of policy makers shifted towards an improvement in domestic production as a means to improve food security, a period known as the Green Revolution. The adoption of improved crop technologies and seed varieties was the main source of growth during this period. The Government of India chose a strategy that involved the rapid dissemination of high yield varieties of rice and wheat, coupled with an intensification of fertilization and irrigation. The results of this strategy were dramatic and swift, with massive jumps in observed yields. Production of wheat, which had grown sluggishly during the pre-Green Revolution period, more than tripled in a period of 5 years, reaching 26 million tons in 1971-72 and further rising to 36 million tons by the late 1970s. The growth in production of rice was similarly rapid, increasing by a margin of 23 million tons between 1966 and 1979 (Rao 1996).

The successes of the Green Revolution are numerous and well known, one primary victory being India’s achievement of self-sufficiency in food production. Between the periods 1966-70 and 1976-80, the per annum net availability of food grains increased from 82 million tons to 104 million tons and gross output of food grains from 87 million tons to 120 million tons. Additionally, India went from importing 6.4 million tons per annum between 1966 and 1970, to zero imports between 1976 and 1980 (Rao 1996). This phase also saw a greater adoption of technology-based solutions in agriculture, which further led to opportunities for private investment in the agriculture sector. The use of NPK fertilizers rose sharply during this period, as did the use of herbicides and pesticides. The gross irrigated area in the country rose from 33 million hectares in 1966-67 to 52 million hectares in 1983-84; over the same period, fertilizer use increased from 7 kg/ha to 45 kg/ha (Radhakrishna in Rao, 1996). This technology-based approach to improving food production was also accompanied by policy measures to provide food and employment security to the poor, via the creation of institutions such as the National Bank for Agriculture and Rural Development (NABARD) and Regional Rural Banks (RRBs) to strengthen credit sources for farmers (Arora 2013).

After 1980, in spite of a slowdown of agricultural policy initiatives, the rural economy continued to grow endogenously. The average growth rate of production, at 3.21% per annum between 1980 and 1991, was the highest it had ever been, exceeding even the 2.38% per annum growth rate observed during the Green Revolution period. Further, this growth was attributed mainly to yield improvements due to technological growth, rather than expansion in area of production, which had been a significant factor affecting growth in earlier years (Rao 1996). Evidently, the production gains from the Green Revolution era had continued through the 1980s. The importance of fertilizer input to agricultural production was emphasized during this period, which generated a strong national policy push to promote the use of fertilizer and other agricultural inputs. In 1990 alone, agricultural subsidies (in the form of farm input subsidies) in India totaled $3.8 billion, and they increased steadily through the 1990s, reaching $9.3 billion in 2004 (OECD in Ashra and Chakravarty 2007).

Evolution of the Indian Fertilizer Industry

The fertilizer industry in India made a very humble beginning in 1906 with the setup of the first manufacturing unit for Phosphate fertilizer in , with a capacity of 6000 metric tons per annum (Dept. of Fertilizers 2012-13). In subsequent years, the green revolution gave impetus to the industry to grow an industrial base to achieve and sustain self-sufficiency in food grains.

The first subsidy act passed by the Government of India was the Retention Price Scheme (RPS), which was introduced for nitrogenous fertilizers in 1977 and remained in force until 2003. Under the RPS, a Retention Price was fixed for each manufacturing unit by the Government. The difference between the Retention Price of urea and the maximum retail price of urea was paid as subsidy.
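
As a rough illustration of the subsidy rule described above, the sketch below computes the per-tonne and total subsidy for two hypothetical urea units. The unit names, prices, and output figures are illustrative assumptions and not data from this work, and the calculation leaves out any further adjustments that the actual scheme may have applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UreaUnit:
    """A hypothetical manufacturing unit; all values are illustrative."""
    name: str
    retention_price: float  # Rs per tonne, fixed per unit by the Government
    output_tonnes: float    # tonnes of urea sold in the period


def subsidy_per_tonne(retention_price: float, max_retail_price: float) -> float:
    """Subsidy per tonne = Retention Price - maximum retail price."""
    return retention_price - max_retail_price


def total_subsidy(units, max_retail_price: float) -> float:
    """Sum the subsidy paid across all units at a common retail price."""
    return sum(
        subsidy_per_tonne(u.retention_price, max_retail_price) * u.output_tonnes
        for u in units
    )


# Assumed figures: a single maximum retail price, different retention prices.
MAX_RETAIL_PRICE = 5000.0
units = [
    UreaUnit("Unit A", retention_price=9000.0, output_tonnes=1000.0),
    UreaUnit("Unit B", retention_price=7000.0, output_tonnes=2000.0),
]

for u in units:
    print(u.name, subsidy_per_tonne(u.retention_price, MAX_RETAIL_PRICE))
print("Total subsidy:", total_subsidy(units, MAX_RETAIL_PRICE))
```

Because the retail price is common to all units while the Retention Price is set per unit, a unit with a higher Retention Price draws a larger subsidy per tonne under this rule.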

Unsurprisingly, a number of studies have shown that lowering the costs of inputs essential for agricultural production, such as fertilizers and pesticides, has encouraged farmers to use more than they need. Further, although increased fertilizer consumption and improved crop yields were undoubtedly correlated between 1960 and 1990, the latest trends in crop response demonstrate a drastic decline in yields from 1990 to 2010 (Praveen 2014).

Negative Externalities in the Agriculture System

The sweeping changes in national agriculture that were brought about by the Green Revolution resulted in several negative downstream impacts. One study from the state of notes that the intensified use of chemical inputs during the Green Revolution has resulted in “continuous environmental degradation, particularly of soil, vegetation and water resources” (Singh, 2000). Notably, many degradation effects were not apparent at the time, and we are only now understanding the adverse consequences of increased fertilizer and pesticide use. Crop varieties introduced during the green revolution were responsive to chemical inputs, which necessitated both increased fertilizer application and use of irrigation, resulting in water contamination by nitrate and phosphate and changes in the ground water table. Over time, declining nutrient-use efficiency combined with physical and chemical degradation of soil have been seen to limit crop productivity (Singh, 2000). In step with the observation of ecological impacts that intensify over time, some studies have even found it to be more profitable for a farm to convert conventional crops to organic ones (Vasile et al. 2015).

The Indian national economic survey from 2014 states that “the decisions of the government regarding subsidy on inputs for agriculture including fertilizer and increase in the minimum support prices (MSP) could also have an impact on food inflation” (MOF, 2015). Further, productivity levels in Indian agriculture are still much lower than global standards, with growth levels of rice and wheat stagnating since the 1980s (MOF, 2015). Small and marginal farmers have a larger share in fertilizer subsidies than their share in the total cultivated area, so any cut in fertilizer subsidies will hurt them the most (Arora 2013). This problem is further compounded by the fact that only 30% of agricultural subsidies reach deserving beneficiaries (Arora 2013). The problems of poor crop yield and productivity, and of over-fertilization resulting from subsidies, are clearly closely linked. It is therefore logical to conclude that the benefits of technological solutions in one area could spill over into the other. Further, these solutions can either be supported or impeded by agricultural policies, depending on the long term vision with which they are designed.

Policy Recommendations

Alleviating the current lock-in to urea in the country is the underlying strategy by which the problem of rampant over-fertilization (particularly with nitrogenous fertilizers) must be approached. The fertilizer industry, although humble in its beginnings in India, is a powerful and influential force in this area, and a direct beneficiary of the lock-in to urea that was brought about by the Green Revolution. Since subsidies cannot be easily abolished without adversely affecting the welfare of their beneficiaries, it is of interest to consider political means of encouraging industry participation in the production and mixing of fertilizers in a way that would enable more precise farming than we have currently. Industry participation in the analysis and tracking of soil health over time would serve the dual function of improving profitability by enabling the customized provision of fertilizers, while enabling more informed and targeted fertilization decisions to be made by farmers.

The level of policy implementation is also an important consideration. State level policies are effective for governance at a high level, while district level policies make sense for implementation. This is a direct implication of the poor performance of prior policies such as the NAP, which were seen to fail due to insufficient planning of implementation strategies. Adequate capacity for organizing and managing interventions is also a concern. The Soil Health Card scheme is indicative of a realization on the part of the government of the need for information on soil health. However, the current lack of adequate capacity for testing as well as distribution has limited its success. A point-of-use device that would enable farmers to test their own soil, accompanied by a system for generating individualized recommendations, would go far towards fulfilling this need for information on soil health.

Lastly, policies in this area may see more success if they are targeted at specific farmer “types”. Differential treatment of farmers on the basis of a number of socio-economic indicators such as education level, land holding size, gender, and age will allow for benefits to accrue to the subset of the demographic that is most in need of them. A potential application of this model would be the phased abolishment of the urea subsidy, starting with large to medium size farmers and subsequently moving down the pyramid. The ultimate aim is to funnel government support and funds to the bottom of the pyramid. The simultaneous introduction of alternative support policies, timed strategically, would then allow for some flexibility in shifting the market towards more environmentally sustainable practices, such as organic farming, while retaining economic stability.

3.2.2 User Needs Gap

The following section details the range of field research methodologies followed to uncover user needs and design considerations from the target end user/beneficiary: the small holder farmer. A number of interactive activities were designed to elicit specific product and system related design inputs from the user. Further detailed information was garnered from group and individual level interviews. At each stage, specific lessons emerged which are emphasized in the descriptive text, and finally brought together into a “product contract”, which describes how each takeaway translates into product attributes relevant to the design of the device and system. Figure 3.1 below figuratively depicts the design flow described in this section, which includes the translation of data from user inputs into design requirements that ultimately guide the design of the sensor and recommendation system.

Figure 3.1 Design flow depicting translation of user inputs into design requirements

3.2.2.1 Stakeholder Interviews

Interviews were conducted with a range of stakeholders in the soil health value chain. We selected a diversity of users in order to inform our design practices for the technology as well as the accompanying recommendation system, such that the final product meets the diversity of user requirements. The list of stakeholders interviewed included farmers, government soil testing lab (KVK – Krishi Vigyan Kendra) representatives, corporate farming entities or aggregators such as sugarcane cooperatives, and scientists from agricultural research institutions.

1. Farmers

The small holder Indian farmer was our primary and most important stakeholder for this research, as he/she would be the ultimate beneficiary of the downstream benefits of a robust sensing technology and recommendation system under development. We conducted between 120 and 140 interviews (see footnote 2) with farmers in villages in the region in and around Hubli of North Karnataka, as well as the villages in the region close to Dehradun in the northern state of Uttarakhand. Different geographic locations were selected to accommodate differing views on problems related to agriculture that characterize different agro-ecological zones across India. Interviews were designed with the following principles in mind, in accordance with the human-centric approach to design:

[A] Location: The majority of interviews were conducted either at the farmer’s house or field, such that they would be in a comfortable and non-threatening atmosphere enabling them to be open and honest in their answers to questions.

[B] Environment: When possible, we made sure to avoid overwhelming interview participants by outnumbering them. Further, each research team member was given a clear and specific role to perform prior to the start of the interview, such as interviewer, note-taker, translator, photographer, etc. Streamlining the process in this way enabled us to be more efficient in our use of time in villages, and therefore maximize the number of interactions possible.

[C] Preparation: Questions were prepared beforehand, and modified based on the first set of interviews conducted in a region, such that they were reflective of user needs and concerns in that region. One notable observation is that participants found abstract questions, or those that involved hypotheticals, the hardest to answer in a satisfactory way, so questions were designed to be as specific and concrete as possible. The interview format started with simpler, more direct fact-based questions that users would find easy to answer, followed by more complex or opinion-based questions later on. Questions were designed to understand general problems that farmers face, perceptions towards soil testing, the value chain that makes soil testing equipment, service, and training available to farmers, and the value chain that makes soil health related recommendations actionable to farmers. The following list is an example of a subset of questions from interviews conducted in north Karnataka in January 2015 that specifically relate to soil testing and fertilization:

1) Do you test your soil? When and why did you first start, how often do you test, and describe the process followed.
2) Do you value soil testing? Do you think it’s a good thing to do, even if you don’t do it?
3) If you don’t soil test, why not?
4) Do others in your community test their soil? What is their perception? (who, when, why, how often)
5) What’s difficult about soil testing? What do you like about current soil testing practices?
6) How much would you currently have to pay for a soil test? How far would you have to travel?
7) How would you make soil testing different?
8) How do you get your information about soil best practices such as irrigation, fertilization, sowing, and tilling?
9) Do you fertilize your soil? Where do you get your fertilizer from? How much do you pay? What kind/brand of fertilizer do you use?

[D] Observation: An observation protocol was put in place prior to starting the interviews, which included specific directives for note takers to write down observations on body language as well as actual responses. Translators were prepared beforehand to convey direct quotes from farmers rather than paraphrased versions of answers, so that valuable user opinions would not be lost in translation.

Footnote 2: The number is imprecise due to the nature of conducting interviews in this context: people come and go, and family members and neighbors join in with answers to some interview questions not specifically directed at them.

Figure 3.2 (Left) Interview with farmers in Hubli region, Karnataka, (Right) Interview with farmers in Dehradun region, Uttarakhand

A number of interesting lessons emerged from interview data. It quickly became clear that farmers in different regions in India face very different agriculture-related problems, and therefore prioritize concerns about soil very differently. South Indian farmers in the Hubli area listed water scarcity and access to power for irrigation as their two main concerns, while North Indian farmers in Uttarakhand had an entirely different set of concerns: water was not a big issue for them, and their main concerns were the prevalence of pests and the spoilage of farm products due to poor access to transportation in hilly regions.

Takeaway: Soil health is of varying priority for different types of farmers, based on their location, access to commodities, and socio-economic status.

Fewer than 5% of farmers interviewed had ever performed or attempted to perform a soil test in the past. Of those who did collect a soil sample and mail it to the nearest KVK, we did not come across a single person who had received a soil health report in return. For this reason among others, the level of trust in the government process was seen to be low, and farmers revealed that they would much rather test their own soil, if they were able to, than send a sample out for testing. Our takeaway from these results was two-fold. First, this emphasized the importance of the point-of-use aspect of the device, as well as the need for the accompanying recommendation system to generate results immediately, and effectively close the loop right at the farmer’s land.

Takeaway: The point-of-use aspect of the device and system is a key value add.

Second, these results highlighted the importance of having a trusted brand name associated with any technological intervention in this space, which was already characterized by poor response times and low levels of user trust. Going forward, therefore, it would likely be favorable to approach commercialization through a partnership with a private or non-governmental institution in India that had already built a trusted reputation with farmers in its area, such as the Deshpande Foundation or the Himmothan Society that we worked with.

Takeaway: Brand value is important to our end user, so associating with a trusted entity is critical to the success of our system.

While most farmers reported that government representatives did not frequently visit their villages to offer support, some mentioned that they personally traveled to district offices to obtain information about fertilizers and related subsidies. Poor access to fertilizers and subsidies was a commonly highlighted problem. Data from interviews corroborated policy findings that the delivery of subsidies is a political process fraught with corruption, resulting in subsidies not reaching those who need them the most. Farmers with wealth and political connections were reported to be the ones to whom subsidies, seeds, and other inputs were made most available. Demands for bribes by fertilizer distributors were also reported to be common. Distributors were reported to falsely limit supply when demand for fertilizers increased (when it rained, for example), supplying farmers who could pay the highest bribes, those with political connections, and those who required larger quantities at a time. Access to inputs was therefore an additional concern that would directly affect the success of an intervention in this space.

Interview data also revealed that farmers gain awareness of how to improve soil fertility from liberal and/or best-performing farmers in their area. Sharing of information such as best practices and rules of thumb within villages was found to be highly characteristic of farming communities in both the north and south of India. Families also tended to work together on their plots of land, and neighbors helped each other out with labor-intensive processes such as weeding and harvesting. The close-knit nature of the majority of farming communities we observed was a clear indication that a successful technological intervention would need to take into account the importance of social networks in agricultural communities.

Takeaway: A good system design will incorporate a means for farmers to work with each other, as this reflects their natural social dynamic.

Many farmers who were interviewed revealed that while they had a good perception of the utility of soil testing in general, they did not know much about the specifics: the procedure for soil sampling, the meaning of nutrient names such as nitrogen and potassium (they were more familiar with fertilizer names such as urea), and the specific consequences of over- and under-fertilizing. This was true of both male and female farmers; however, men tended to be more interested in learning and acquiring such information if they thought that it would be beneficial to the productivity and growth of their farms. This was corroborated by the independent revelation of gender norms within agricultural families. According to the majority of those interviewed, women tended to be responsible for day-to-day field tasks such as weeding, planting, and ploughing, while men spent less of their time in the field and more time managing larger-scale operations such as purchasing inputs (seeds, fertilizers, etc.) and selling produce. Soil testing, therefore, was likely to be a male-focused problem area.


Figure 3.3 (Left) Group interview with women in Hubli region, Karnataka, (Right) Interview with women at their home in Dehradun region, Uttarakhand

2. Krishi Vigyan Kendras (KVKs)

KVKs are front-line agricultural extension centers financed by the Indian Council of Agricultural Research (ICAR). Among other functions, KVKs are mandated to organize and conduct training courses and educational programs for farmers in their region, demonstrate the latest agricultural technologies to farmers as well as extension workers, and perform analyses of soil samples sent to them by farmers (ICAR, 2010). KVKs in each district are equipped with fully functional laboratories to perform in-depth analyses of soil, measuring macronutrient concentrations, micronutrient concentrations, electrical conductivity, moisture content, and organic carbon content of soil samples. A total of 642 KVKs are operational in the country (ICAR, 2010), which is equivalent to 1 lab for every 216,000 operational holdings. This under-capacity for soil testing is partly to blame for the low fraction of farmers who test their soil. Additionally, the performance and success of KVKs vary widely depending on location, management, and business model, and KVKs have recently come under great scrutiny (Yadav, 2014).

We visited labs and interviewed scientists at a total of three KVKs across India: two in the state of Karnataka, and one in Uttarakhand. Only one of those appeared to be functioning at full capacity, and it provided us with a view of what a KVK that embodied the true spirit of its mandate could achieve. This KVK was very well maintained, and appeared to be managed in a transparent manner. A number of farmer training programs were in session when we visited, and there appeared to be good participation from farmers as well as staff. We also observed a number of test plots in which various hybrid seeds were being tested for performance in arid conditions. Part of the funding for this particular center came from the government, with the day-to-day management being outsourced to a private company.
This public-private partnership model appears to be a very successful one for KVKs in general: it ensures a base level of competition and accountability, while retaining the benefits of government funding, standardization of procedure, and outreach. A number of important takeaways emerged from interviews with a range of stakeholders in this progressive KVK, and these are listed below:

1. The process of behavioral change in agricultural communities takes approximately 3 years. New technological solutions should start with liberal farmers, make use of demonstrations as a tool to encourage adoption, and involve several follow-up visits to the field.
2. Incorporating a field for “suggested crops” could be very helpful as part of a recommendation system for farmers.
3. Other pressing concerns such as weather, water availability, and geography often trump soil health, reducing the urgency for awareness building in this space. It is therefore important to lower the economic and social barriers to adoption as much as possible if a solution in this field is to be successful.
4. Government subsidies on urea are a likely cause of over-fertilization and imbalanced NPK ratios, which is a difficult problem to address. This is just one of the geopolitical and economic factors that come into play; technological solutions need to leverage these factors or find ways to circumvent them.
5. Organic fertilizer recommendations would be valuable for farmers, or perhaps a combination of organic and inorganic, depending on the appropriateness to a specific region.
6. All new technological solutions need to be accompanied by educational programs to train users in the correct way to use them.

Our observations in the other, more poorly maintained KVK labs indicated that, in general, soil labs were understaffed, disorganized, and inefficient. We observed what appeared to be over-reporting of monthly test numbers: the reported figures were difficult to reconcile with the activity we observed in the labs. While two samples are not enough to draw broad conclusions, our interactions helped us appreciate why farmers strongly advocated for an alternative solution to the problem of poor soil health.

3. Corporate Farming Entities (Sugar cane mill)

Corporate farming entities are a common phenomenon in India. These aggregators are most frequently seen where there is a market for high-value and high-volume commodities such as sugar, wheat, and rice. Specific employment terms vary across companies, but in general, farmers are hired on contract to grow a particular crop (such as sugar cane) on their land during certain times of the year, and to sell 100% of their produce to the aggregator (the sugar mill) at a fixed price. The sugar mill acts as an assured source of income for the farmer, as well as a source of information on best practices for crop growth. This aggregator model was of interest since it represented a common alignment of incentives towards improving soil health and productivity, and therefore presented a potentially efficient platform for outreach to farmers.

We visited and interviewed stakeholders at Godavari Biorefineries Ltd., a global agribusiness and bio-energy corporation in Karnataka. The company is one of the largest producers of sugar in the world and the leading manufacturer of sugar in India. It produces sugar, ethanol, bio-fertilizer, and power, all of which are generated from sugar cane as the raw material. In Karnataka, Godavari engages with over 30,000 farmers. Interviews with scientists at the sugar mill revealed that while traditionally there has been a lot of focus on seed quality and hybrid plant variety, in recent years they have also leveraged the tools of information and communications technology and the Global Positioning System to record the land holdings, some agricultural practices, and soil testing results of each farm. The sugar mill performs in-house research and development of new high-yielding varieties of sugar cane, which it tests on the numerous plots of land acquired close to its campus specifically for testing. For soil testing, however, the mill relies on subsidizing farmers to have their soil tested at government labs.
The situation highlights that such a corporate farming entity could potentially benefit from a point-of-use soil testing solution by reducing dependency on government laboratories, performing timely testing, and providing soil-related recommendations via their internal extension network.

Figure 3.4 Sugar processing plant at Godavari Biorefineries Ltd., Karnataka

4. Research Institutions

There are a large number of agricultural research facilities in India, both government and private, that conduct huge volumes of research and produce numerous publications on a variety of topics of use to farmers: crop varieties, seed varieties, agricultural techniques, pests and diseases, etc. Our extensive interactions with farmers in both the Hubli and Dehradun regions revealed that, although such large volumes of useful information exist from credible and reliable scientific sources, farmers often rely on rules of thumb and traditional agricultural practices passed on to them from their forefathers. Even those farmers who expressed interest in modern practices and techniques were frequently unaware of where to access information that would be beneficial to them. The following is a synthesis of findings from interviews with a range of private and government research institutions.

Indian Agricultural Research Institute (IARI)

IARI is a national multi-disciplinary research institution in India that is over 100 years old. It performs research in the areas of Agriculture, Cattle Breeding, Chemistry, Economic Botany, Mycology, and Bacteriology. IARI’s mandate is very much in agreement with the goals of this research project, and its elements are therefore listed below (IARI, 2007).

1. To conduct basic and strategic research with a view to understanding the processes, in all their complexity, and to undertake need based research that leads to crop improvement and sustained agricultural productivity in harmony with environment.
2. To serve as a center for academic excellence in the area of post-graduate and human resources development in agricultural sciences.
3. To provide national leadership in agricultural research, extension and technology assessment and transfer by developing new concepts and approaches and serving as a national referral point for quality and standards.
4. To develop information systems, add value to information, share the information nationally and internationally, and serve as a national agricultural library and database.

IARI has developed a mobile soil testing unit that was, in large part, an inspiration for our solution. While the unit, which costs approximately $600, is far more cost effective than a full-scale laboratory, it is not cheap enough to be affordable for a single farmer, or even a group of farmers. The device is small enough that it can be transported directly to a farmer’s location, which goes a long way in addressing the issue of time delays between sending a sample for testing and receiving a recommendation report. The device comes with a range of chemical reagents for testing a range of nutrients, as well as an inbuilt shaker necessary for the chemical process. The disadvantages of the mobile unit lie in the need for a qualified technician to operate it, the long wait time for a result (~40 minutes), and the minimal recommendations generated.

Our discussion with various researchers at IARI was therefore focused on the most important aspects of a potential new solution in this area that would improve on the mobile soil testing kit that they had created. Their experience corroborated our findings that the point-of-use aspect of the system was critical to generating trust and therefore adoption of the solution. Additionally, interviewees were in favor of a solution that would be affordable enough and easy enough for a farmer to use on his own, if such a thing were feasible.

Takeaway: The point-of-use aspect of the device and system is a key value add.

International Plant Nutrition Institute (IPNI)

IPNI is a global non-profit organization that runs initiatives and manages programs to address growing global needs for food, fuel, fiber, and feed. Their mission is to “develop and promote scientific information about the responsible management of plant nutrition for the benefit of the human family” (IPNI, 2016). The following information is from an interview with Dr. Kaushik Majumdar, the director of IPNI’s South Asia Program.

IPNI has approximately 14 programs worldwide. They work with state and national agricultural universities and international organizations to increase the productivity and profitability of farms while minimizing environmental impact. IPNI has developed proprietary software called Nutrient Expert, intended to help farmers make informed fertilization decisions on the basis of plant symptoms. This is an interesting complement to our proposed solution: it bypasses the need for a soil test and performs only symptom-based diagnostics. The software is currently used by ITC eChoupal and Tata Consultancy Services mKrishi, which are both large-scale CSR initiatives to assist rural Indian farmers with soil-health related decisions. Additionally, IPNI has developed a system for sending farmers text messages populated with generalized state-level recommendations. This is accompanied by a call center where farmers can call in and get information about different types of fertilizers and how to use them. This setup suggests a promising route for commercialization of a low-cost point-of-use soil testing sensor with an accompanying recommendation system, as such a sensor would address a currently unfilled niche within this system while complementing IPNI’s business model instead of competing with it.

Another important lesson from IPNI is the consideration of farmer typology when generating strategies to address nutrient management in farms. An IPNI study on typology revealed that farmers’ use of fertilizers and other inputs is strongly influenced by their resource endowment, as well as by a combination of other socio-economic determinants. Therefore, categorizing farmers on the basis of various socio-economic parameters allows for the design of realistic and effective intervention strategies. This typology-based approach to solution design is highly relevant to this project, and is discussed in more detail in Chapter 4.


Takeaway: A typology-based approach to categorizing farmers will allow for the design of an effective and successful technological intervention.

Indian Council of Agricultural Research (ICAR)

The Indian Council of Agricultural Research is an autonomous organization under the Department of Agricultural Research and Education (DARE) of the Government of India’s Ministry of Agriculture and Farmers Welfare. There are over 100 ICAR institutes and 71 agricultural universities spread across the country, making it one of the largest national agricultural systems in the world. ICAR’s mandate is to coordinate, guide and manage research and education in agriculture in India. The following information is from an interview with scientists at the ICAR Soil Conservation Center near Dehradun, Uttarakhand.

The ICAR scientists we interviewed were in strong agreement that the wealth of information available with state agricultural universities was a great resource for the generation of recommendations. They further agreed that the biggest challenge impeding a successful soil testing technology was a lack of knowledge among farmers regarding the proper techniques for soil collection and fertilizer management. Although the management of KVKs falls under the mandate of ICAR, interviewees’ impression of KVK soil labs in India was poor, as was their impression of mobile soil testing systems that exist in the market without customized recommendations. They also suggested expanding a testing solution to materials such as manure and cow dung, which are gaining popularity as fertilization materials as the demand for organic produce rises.

MS Swaminathan Research Foundation (MSSRF)

MSSRF is a non-profit trust foundation that aims to “accelerate [the] use of modern science for agricultural and rural development for development and dissemination of technology to improve lives and livelihoods of tribal and rural communities” (MSSRF, 2014). The foundation runs a number of programs that explore ways of adopting science and technology to address practical problems faced by rural populations in agriculture, food and nutrition. The following information is from an interview with Dr. Ajay Parida, Executive Director of MSSRF.

The interviewee listed the following five factors as the biggest challenges in agriculture today: uncertainty in rainfall patterns, post-harvest losses, a need for food processing technologies, a lack of mechanization, and the migration of youth away from farming. He further emphasized that these problems do not exist in isolation; rather, they interact and compound in intensity. For instance, the migration of youth away from agriculture has resulted in a dearth of farm labor, thereby increasing the need for mechanization in agriculture. He highlighted information dissemination as a key way to sustain interest in the field of agriculture, citing the India Meteorological Department’s initiative to disseminate weather alerts via text message as a success. Additionally, he mentioned that farmers tend to share information among themselves, and that members of farming communities tend to be highly inter-dependent and social.

3.2.2.2 Interactive Workshops

A series of design workshops was conducted to further engage directly with the end user in an interactive, group setting. These workshops were conducted in January and August of 2015, with the help and coordination of the Deshpande Foundation and the Himmothan Society.

Workshop Design Principles

In planning the execution of the workshop, the following design principles were taken into account:

[A] Tone

The tone of the workshop first and foremost had to be respectful and amicable. In order to facilitate open and honest discussion, it was imperative that the participants felt comfortable.

For all workshops, a local leader within the NGO was designated the workshop coordinator, given their previous rapport with the community, ability to speak the local dialect, and understanding of local culture, tradition, gender roles, and social norms. This coordinator was responsible for explaining all activities, encouraging positive group morale, and ensuring that everyone was engaged and involved.

At the start of each workshop, locally sourced popular music was playing in the room as the users walked in, to immediately create a positive atmosphere. In order to create a sense of community and break down social barriers, icebreakers were subsequently conducted, wherein all workshop participants and coordinators shared their favorite food or Bollywood movie. To further minimize hierarchy and support social cohesion, all workshop coordinators were asked to dress modestly and appropriately.

Finally, whenever possible, it was made clear that the mission of the workshop was to help participants become better farmers. By reminding the participants of the mission and creating a positive and cordial tone, we were able to obtain not only the most information from the farmers, but also their honest feedback.

[B] Group size and characteristics

A challenge with any group activity is determining the group size. For this research activity in particular, it was essential that we chose an appropriate number of participants so as to balance the benefits of group discussion and inter-personal interaction with the disadvantages of excessive group size. Too large a group could become chaotic, loud, and a logistical challenge. Through practice sessions, it was determined that the best ratio of participants to workshop coordinators was 5:1, such that each coordinator could be in charge of a group of five, along with the help of a translator. As such, with two to three main workshop coordinators during each trip, it was communicated to the local NGO prior to each workshop that a group size of 10-15 participants was ideal.

In addition to group size, we made an intentional effort to plan productive groups. Prior to each workshop, we asked the local NGO to include users from across different demographic spectra: age, gender, landholdings, educational background, and family size. We also asked the NGO to invite users that represented both “extremes and mainstreams,” so as to not only ensure a vibrant discussion amongst participants with differing perspectives, but also to learn from extreme behaviors or views that challenge the status quo of smallholder farming.


[C] Establishing a shared reference frame

In order to extract accurate information from all workshop activities, it was essential that all participants share the same reference frame. Such a reference frame included not only the details of the activity of interest (whether it was group or individual work, how long the activity would last, and so on) but also what knowledge would be required in order to perform the activity. If, for example, an activity required users to draw on previous knowledge of analogous agricultural devices, the workshop coordinators or translators would ensure that all users were familiar with such a device and could articulate in their own words what they were envisioning. The workshop was also designed with frequent call-and-response checkpoints, so that any word or phrase that confused participants could be clarified and all users kept within the same reference frame. By avoiding assumptions of previous knowledge and maintaining frequent feedback mechanisms between the users and the workshop coordinators, we were able to encourage empowered participation.

[D] Making ideas visible and tangible

An important guiding principle for our workshops was making ideas tangible rather than abstract. From initial interactions with farmers, it became clear early on that they did not respond well to abstract and hypothetical scenarios, but rather preferred concrete examples and questions. With this feedback in mind, we designed many of the workshop activities to target a specific critical question of interest. In the case of questions that did try to engage with more hypothetical scenarios (“what would your ideal soil diagnostic device be?”), we encouraged farmers to draw on concrete examples or experiences within their own lives to better perform the activities (“what is your favorite agricultural device that you own and why?”).

[E] Being cognizant and proactive with respect to social and cultural norms

As outsiders, it was essential for us to try to be proactive in addressing social and cultural norms. On our end, this manifested itself in many ways. We dressed accordingly. We addressed those older than us with a respectful suffix, “ji.” We thanked participants in a traditional fashion with our hands clasped together. We anticipated that we would not be able to learn all the social norms fully, but by studying them, asking our NGO partners, and being cognizant of them on the ground, we were able to assimilate as best we could.

The importance of these social and cultural norms was further amplified in the rural agricultural communities from which participants came. In these communities, social hierarchy is very much built upon demographic traits such as age, gender, landholdings, literacy, and education level. Thus, in order to achieve a successful workshop, we tried to be proactive in foreseeing such issues by encouraging self-selection of groups rather than pre-selection of groups before demographic data had been collected. In addition, it was made very clear to the main workshop coordinator that all voices should be taken into account, and that no voice should overpower another. We asked the main workshop coordinator and the translators to be vigilant in ensuring that social or cultural norms did not compromise the quantity or quality of feedback received from any individual.

Workshop Activities

With the above principles as the fundamental guideline, workshop activities were designed with the goal not only of understanding the users’ main pain points and needs, but also of gauging feedback on prototype designs. Through these workshops, the user’s voice was incorporated throughout, feeding an iterative process of ideation, modeling, and user testing.


[A] Workshops Round 1

The first set of workshops, which took place in January of 2015 in Hubli, Karnataka, focused on gauging the pain points of current soil testing methodologies and determining the main user needs and, correspondingly, the required product attributes for a new soil testing system.

Workshops consisted of several interactive activities conducted over a four-hour period. Over the course of three separate workshops, variations of specific activities were introduced to simultaneously study the effect of different activity executions on the quality of feedback received. Each workshop was organized into six different exercises, further described below. The order of the exercises was intentional. At the outset of the workshop, we wanted to prime the group with more casual discussions that users would find easy to speak about. The casual, communal nature of this section of the workshop was intended to establish a shared sense of purpose across participants and coordinators alike.

The most important exercises were placed toward the end of the workshop, when users would not only feel most comfortable with us, but also have enough context from earlier discussion to provide the highest-quality feedback. Beyond importance, the order of the exercises also followed that of the soil testing process itself, allowing users to relate each exercise to their own soil testing experiences.

1. Introductions and project purpose

At the start of the workshop, we introduced ourselves and gave a short overview of the project and workshop. The project overview was intentionally kept vague so as not to limit the scope of conversation. We were still in the framing phase at this point, and it was essential to remain broad in scope so as not to miss any potentially promising leads.

The workshop was specifically attributed to the MIT Tata Center as well as the Deshpande Foundation, so as to separate the workshop from ourselves and make participants less likely to withhold critical feedback for fear of personally offending any of the coordinators. Finally, we emphasized that the workshop was intended to benefit the participants, that the only right answer was an honest answer, and that ultimately we were there to learn from them and not the other way around. In this way, the notion of hierarchy was minimized, bringing coordinators and participants to the same level.

Following the project purpose discussion, everyone shared their respective icebreakers – specifically what their name was, what crop they grow, and what their favorite Bollywood movie was. Ice breakers helped lighten the mood and make participants more comfortable.

2. Pain points

The first activity was devoted to getting participants into a mindset conducive to discussing soil testing and brainstorming solutions. Before any ideas could be discussed, it was essential to discuss the problem at hand in detail. In an open-discussion format, participants were asked first to share the difficulties they face in general as farmers, and second to narrow down those problems with respect to soil health. As farmers spoke of their pain points, translators and workshop coordinators recorded audio and took notes to ensure that these pain points could be converted into user needs for later stages in the design process. In addition to concerns about soil health, the following is a summary of the main pain points identified:


1) Water: Regardless of soil health or fertilizer purchased, plants cannot grow without adequate water. The majority of crops in the region require irrigation (rice, sugarcane, chili, groundnut), but water pumps and systems are power intensive and expensive. Thus, users must rely on rain-fed irrigation of their crops. Rainfall is unpredictable, and since droughts have intensified over the last decade, users find it difficult to navigate risk given the unpredictability of climate patterns.
2) Lower yields over time: Across the board, farmers reported crop yields worsening over the past few years; however, no specific causal factors were identified.
3) Labor: The unavailability of agricultural labor was a frequently cited issue, partly due to the migration of youth from agriculture to cities for work. The majority of farmers interviewed mentioned labor-intensive tasks such as sowing and harvesting as a problem.
4) Fragmented landholdings: Over time, the average size of land holdings in India has been falling, due to fragmentation as land owners divide up their properties between their children. Fragmentation raises issues such as difficulty in scaling up production techniques, and difficulty managing land fragments in locations far from each other.

Takeaway: While soil health remained a serious pain point for farmers, issues related to water and drought were most concerning to them.

3. Product Attribute Exercise

The product attribute exercise was the first structured element of the workshop. We conducted this exercise in three different iterations for different workshops, but the goal was the same: to have users identify a hierarchy of product attributes such as accuracy and cost. At the time, our soil testing technology did not yet have defined technical specifications in terms of accuracy, longevity, and cost, among others; these attributes would therefore serve as the foundation upon which all technical and systems solutions would be built.

Given that soil diagnostic technologies were not fully available to or used by the participants, we first asked users to think about their favorite products in their homes or on their farms, share aloud why they valued them, and correspondingly, which attribute they found most valuable. While it was known that product attributes vary across products, we were most interested in participants’ rationale as to what they value in products and why. Three variations of this exercise are described below:

1) Product attribute cards: This exercise involved users ranking a set of pre-made pictographic cards depicting specific product attributes. The cards were made visual so as to accommodate illiteracy amongst participants, as well as to use imagery to reinforce the meaning of each attribute. For the exercise, participants were split into groups of five and each group was given a set of five cards depicting attributes such as cost, durability, reusability, time required, and labor intensiveness. Participants had five minutes to review the cards and ask the translators any questions they had about card meanings or ambiguities. Then, in rounds of 30 seconds, the users were asked to discard the card they least cared about and put it in a bin in the center of the group. Translators and coordinators collected the bins after each round and ranked the cards from 5 to 1 by round, with 5 signifying the least desirable attribute. The 30-second time limit was intentional, to ensure that users acted on intuition and gut feel rather than on deliberate reasoning.


[Bar chart: Product Attribute Ranking. x-axis: Product Attribute (Durability, Labor, Time, Cost, Reusability); y-axis: Attribute Ranking, 0 to 3.5]

Figure 3.5 Product attribute rankings from Workshop Round 1

Figure 3.5 shows the resulting product attribute rankings of farmers from the first round of workshops. A higher average ranking implies a lower value for that attribute. The data shows that users value product attributes like durability and reusability, and are willing to trade off cost for these attributes. This finding was somewhat unexpected, given that the common approach to designing for resource-poor communities treats minimizing cost as the primary concern. The important lesson from this finding was not to underestimate users’ expectations, and therefore to avoid designing an inappropriate solution that would compromise quality for cost.

Takeaway: Users are willing to trade off cost for product attributes related specifically to longevity, durability, and reliability.
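The card-elimination scoring described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis actually used in the thesis: the function names and the two example groups are hypothetical, and the averaging across groups is an assumption about how the per-group ranks were aggregated into Figure 3.5.

```python
from statistics import mean

# The five attributes printed on the cards (from the exercise description).
ATTRIBUTES = ["cost", "durability", "reusability", "time", "labor"]


def ranks_from_discards(discard_order, attributes=ATTRIBUTES):
    """Convert one group's discard sequence into ranks.

    The card discarded in the first round gets rank 5 (least desirable),
    the next gets 4, and so on; the card never discarded gets rank 1.
    """
    n = len(attributes)
    if len(set(discard_order)) != n - 1 or len(discard_order) != n - 1:
        raise ValueError("expected exactly n - 1 distinct discarded cards")
    if not set(discard_order) <= set(attributes):
        raise ValueError("unknown attribute in discard order")
    ranks = {card: n - i for i, card in enumerate(discard_order)}
    (survivor,) = set(attributes) - set(discard_order)
    ranks[survivor] = 1
    return ranks


def average_ranks(groups, attributes=ATTRIBUTES):
    """Average each attribute's rank across groups (assumed aggregation)."""
    per_group = [ranks_from_discards(g, attributes) for g in groups]
    return {a: mean(r[a] for r in per_group) for a in attributes}


# Hypothetical discard sequences for two groups (not workshop data).
hypothetical_groups = [
    ["cost", "time", "labor", "reusability"],   # durability kept to the end
    ["time", "cost", "labor", "durability"],    # reusability kept to the end
]
averages = average_ranks(hypothetical_groups)
```

With this convention a lower average means the attribute was held onto longer, which matches the reading of Figure 3.5 given in the text.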

Figure 3.6 Product attribute list in Kannada and English

2) Product attribute list: In this exercise, the attributes were written on the board in English and in Kannada with their respective definitions. The moderator discussed the definitions one by one so as to reinforce a shared frame of reference. As before, participants were divided into groups of five. Users were asked to rate each attribute as being of “low,” “medium,” or “high” importance and to describe their rationale to the translator.


3) Product attribute tradeoffs: This iteration was designed to gauge the types of tradeoffs participants make when purchasing an agricultural product. Tradeoff pairs such as cost versus durability and accuracy versus simplicity were introduced in the context of a hypothetical situation, for example: you are choosing between two tractors, a more expensive one from a brand you trust or a less expensive one from a knockoff brand. Which do you choose, and why? Translators then relayed the preferred attribute and rationale to the note takers in each group, who marked the preferred attribute with a 1 and the other with a 0.

4. Soil Collection Methods

The purpose of this exercise was to take the users through the chronology of the soil testing process, starting from soil collection. Three types of prototypes were prepared, and three methods of presentation were executed for the three respective workshops, described below:

1) Storyboards: A series of images were prepared depicting the user steps of the soil collection process. The images were projected onto a large screen in the center of the room, and the moderator walked through the storyboard one by one, explaining the step, and opening up the room for group discussion.

2) Coordinator roleplay with prototypes: The moderator and the main workshop coordinators roleplayed the soil collection process with foam, cardboard, and paper props. The moderator described each point of the soil collection process, and paused afterwards for a group discussion.

3) Participant role-play with prototypes: Participants themselves role-played with the props outdoors. For example, for the method that involved digging a hole in the ground, a participant volunteer actually dug a hole out in the garden with a shovel provided. In this way, the users were able to viscerally engage and experience a mock version of the soil collection process, and therefore be able to comment on their feelings and pain points more effectively.

Figure 3.7 (Left) Storyboards depicting soil collection methods, (Right) Farmers participating in role play exercise with outdoor props


The overwhelming consensus among farmers was that soil collection was not a difficult process at all. This came as a surprise to us, given that the task of digging 15 holes is labor intensive and difficult. However, given the laborious nature of their day to day work, farmers were able to quite easily do this, indicating that the soil sample collection part of the cycle was not of great concern.

Takeaway: Soil collection method was not a concern, given the laborious nature of farmers’ day-to-day work. However, quality control may remain an issue; educational information on proper soil collection techniques should therefore be distributed along with the device.

5. Soil Test Cards

For this exercise, we created a variety of looks-like paper prototypes that demonstrated different colorimetric techniques of soil detection and presentation of the result. Given that a colorimetric-based approach was the most likely transduction mechanism at this stage of the project, we specifically wanted to focus on whether users found this solution intuitive to use. The paper mockups showed colorimetric readouts of N, P, K, and pH values in the form of blocks, circles, and distance-based ‘lines’, shown in Figure 3.9. Participants were divided into subgroups of five and asked to decode a series of fake readouts depicting different concentrations of soil nutrients. After reading the soil nutrient concentrations from a card, participants were asked to rate the difficulty of decoding it before moving to the next card. Workshop coordinators observed users as they worked with each other to decode the cards, and compared their qualitative observations of user difficulty with the difficulty users reported.

Figure 3.8 Female participants working together to decode colorimetric sensor prototypes


Figure 3.9 (Right top) Paper mockups of colorimetric soil testing sensors, (Right bottom) Soil card to interpret color-based sensor reading (Left) Soil card to interpret distance-based sensor reading

35% of participants reported that they preferred the sensor with maximum colored area (four quadrants colored for each nutrient), since it was easier to read. The remaining sensor mockups received about equal rankings for preference. Notably, there was some level of ambiguity in reading and interpreting color based results, and participants within a group often disagreed about color-matching.

Takeaway: Although error rate and self-reported ease were satisfactory, users qualitatively found colorimetric methods to be ambiguous, subjective, dependent on ambient light, and in need of extra instruction.

6. Open Discussion and Closing

The final part of the workshop consisted of an open discussion of which activities participants found particularly engaging or useful, and what suggestions they had for future workshop iterations. It also included demographic data collection, which we used to match individual qualitative and quantitative results from the workshop. Lessons from this discussion were applied to subsequent workshops in order to improve the quality of interaction.


[B] Workshops Round 2

The second set of workshops occurred in August of 2015 and took place in two locations: Hubli, Karnataka and Dehradun, Uttarakhand. In expanding to a second location, we hoped to garner a greater diversity of user feedback and determine whether our soil diagnostic system would be appropriate for different types of agricultural communities.

The main purpose of these workshops was to generate user feedback on a set of ion selective electrode based ‘user experience’ dummy prototypes, and to determine the preferred input and output communication mechanisms for the recommendation system. These mechanisms refer, respectively, to the method by which users send their soil diagnostic result to a central database and the method by which they subsequently receive their soil fertilizer recommendation.

A total of four workshops were conducted, each approximately 3 hours long. The introductions, discussion of pain points, and conclusions in this workshop were the same as previously described, with the central activities varying. These are described below.

1. Demonstrative Video

Prior to the workshops, we created a short two-minute demo video highlighting the various steps of the user experience. This video started at a stage after which the user had a composite soil sample from the farm, and subsequently highlighted (i) adding an extractant solution to release anions from the soil, (ii) dipping the ion selective electrode strips into the aqueous soil solution, (iii) waiting for the sensor to calibrate, (iv) interpreting the soil diagnostic result from the reader, (v) reading the crop code from the back of the device, and (vi) creating alphanumeric codes representing the respective N, P, K, pH values and crop type of the user, to be used as an input to the recommendation system later on. The moderator allowed users to watch the video on their own once, and then played the video again, pausing at each step in the process to explain and open it up to group discussion. The video primed users for the next exercise, in which users interacted with a ‘works-like’ prototype of the device.

2. Product Demonstration

A set of paper-based prototypes was created, with an Arduino microcontroller system acting as the ion selective electrode strips and reader respectively. The prototypes simulated a fake test result that mimicked the actual functionality of the sensor. In each workshop, the soil testing process was demonstrated to users using a fake soil sample and fake extractant solution. Users were instructed to provide feedback to translators and note takers on how easy they perceived each step to be. User feedback included comments on the size of the device, the location and structure of information printed on the device (N, P, K, pH), and the brightness of the LED display, all of which directly fed into design specifications for the device.


Figure 3.10 Demonstration of sensor functionality using works-like prototypes

3. Input/Output Mechanism Roleplay

Once users had the chance to interact with the dummy works-like prototype, they each had a pre-coded alphanumeric code representing their respective N, P, K, pH values and crop type. Using that alphanumeric code as an input, we role-played three different methods for sending that information to the central database: via postal service, via text message, and through a website. The role play involved acting out the specific interaction method under discussion (such as looking at the alphanumeric code and texting it to a number), accompanied by a verbal description of each method by the translator. After each role play, the moderator facilitated a group discussion on what users found appealing or difficult. At the end of the exercise, users ranked the input methods in decreasing order of preference.
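To make the idea of a compact alphanumeric code concrete, the sketch below encodes and decodes one possible layout in Python. The thesis does not specify the code format at this point, so everything here is an assumption for illustration: the three-level (L/M/H) nutrient readings, the pH stored in tenths, the two-character crop code, and the function names `encode_result` and `decode_result` are all hypothetical.

```python
# Hypothetical code layout (an assumption, not the thesis's format):
#   chars 0-2 : N, P, K levels, each one of L / M / H
#   chars 3-5 : pH multiplied by 10, zero padded (6.5 -> "065")
#   chars 6-7 : two-character crop code printed on the device
LEVELS = ("L", "M", "H")


def encode_result(n, p, k, ph, crop_code):
    """Pack one soil test result into a short string suitable for a text message."""
    for level in (n, p, k):
        if level not in LEVELS:
            raise ValueError(f"nutrient level must be one of {LEVELS}")
    if not 0.0 <= ph <= 14.0:
        raise ValueError("pH must lie between 0 and 14")
    if len(crop_code) != 2 or not crop_code.isalnum():
        raise ValueError("crop code must be two alphanumeric characters")
    return f"{n}{p}{k}{round(ph * 10):03d}{crop_code}"


def decode_result(code):
    """Unpack a code produced by encode_result into a dictionary."""
    if len(code) != 8:
        raise ValueError("code must be exactly 8 characters long")
    n, p, k = code[0], code[1], code[2]
    if any(level not in LEVELS for level in (n, p, k)):
        raise ValueError("invalid nutrient level in code")
    return {
        "N": n,
        "P": p,
        "K": k,
        "pH": int(code[3:6]) / 10,
        "crop": code[6:],
    }


# Example with made-up values: low N, medium P, high K, pH 6.5, crop "07".
example_code = encode_result("L", "M", "H", 6.5, "07")
example_decoded = decode_result(example_code)
```

A fixed-length layout like this keeps the message short enough to type on a basic phone and lets the receiving system reject malformed codes before generating a recommendation.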

The same activity was repeated for the soil fertilizer recommendation output. We role-played three methods: postal service, a voice message received on the phone, and an interactive web-based GUI. Feedback and rankings were similarly collected for the output mechanism. Results of this exercise are shown in Figure 3.11 below, in which a lower ranking indicates a higher preference for that mode. Based on the role play and verbal description of the interaction mechanism, users expressed a preference for the text message as an input mechanism, and for the voice recording as the format in which to receive a recommendation from the system. In general, older farmers preferred text based recommendations that they could read at leisure later, while the younger demographic was more comfortable with using text messages.


[Bar charts: Input Mode Preference (left; Postal service, Text message, Website) and Output Mode Preference (right; Postal service, Voice message, Website); y-axis: Ranking, 0 to 3]

Figure 3.11 Aggregate rankings of Input (left) and Output (right) mechanisms of interaction with soil health recommendation database

Takeaway: Users preferred to send soil health results to the database via a text message, and receive recommendations via voice recordings.

4. Interpretation Exercise

In order to determine the relative efficacy of text and voice as modes for relaying soil fertilizer recommendations, this activity asked participants either to read a text-based recommendation or to listen to a recorded voice-based version. A group discussion followed, in which users were asked for specific information from the soil recommendation they had just read or heard, such as “What was your nitrate concentration?” and “How much urea should you apply on your farm?”. In general, farmers were seen to perform better at interpreting written recommendations than at following voice-based instructions. Notably, although many users ranked the voice recording mechanism highly in the previous exercise, the majority of them changed their minds after experiencing the voice recording first hand.

Figure 3.12 Farmers in Hubli, Karnataka reading mock recommendation cards


Takeaway: Farmers found it much easier to comprehend recommendations provided to them in text form, as opposed to hearing detailed instructions delivered to them in a voice recording. This is in contrast to preferences expressed without a demonstration.

5. Cost Threshold

In the final exercise, users specified a cost threshold for the system. By this stage in the workshop, users were able to envision the whole system, including the core technology and recommendation system, and were therefore able to gauge the value of such a soil diagnostic solution in their lives. Given that the technology consisted of disposable diagnostic ion selective electrode strips and a fixed reader, we asked users to independently indicate the value they placed on each component. On a scale of 1-1000 Indian Rupees, users circled where they felt each respective component should be priced. In order to generate as realistic a price point as possible, we emphasized that there was no right or wrong answer to this question. Further, participants were informed that this exercise was voluntary, and that they need not specify a price if they were not comfortable doing so.

Takeaway: Users specified a cost threshold of approximately ₹500-₹600 for the fixed cost of the reader, with a variable cost of ₹15-₹60 for each sensor strip.
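The thresholds above can be turned into a rough cost-of-use estimate. The short Python sketch below is illustrative arithmetic only: it combines the reported price ranges with the twice-a-year testing frequency from the use case, while the function names and the choice of 100 tests as an amortization horizon (borrowed from the longevity constraint in the product contract) are assumptions.

```python
# Reported thresholds in Indian Rupees (from the takeaway above).
READER_COST_RANGE = (500, 600)   # fixed, one-time cost of the reader
STRIP_COST_RANGE = (15, 60)      # variable cost per disposable sensor strip

# Use case: one test before each of the Kharif and Rabi seasons.
TESTS_PER_YEAR = 2


def first_year_cost(reader_cost, strip_cost, tests_per_year=TESTS_PER_YEAR):
    """Reader purchase plus one year's worth of sensor strips."""
    return reader_cost + tests_per_year * strip_cost


def amortized_cost_per_test(reader_cost, strip_cost, lifetime_tests):
    """Cost per test when the reader is spread over its lifetime of tests.

    lifetime_tests is an assumption supplied by the caller.
    """
    return reader_cost / lifetime_tests + strip_cost


low_first_year = first_year_cost(READER_COST_RANGE[0], STRIP_COST_RANGE[0])
high_first_year = first_year_cost(READER_COST_RANGE[1], STRIP_COST_RANGE[1])

# Assumed horizon of 100 tests, matching the "> 100 soil tests" longevity target.
low_per_test = amortized_cost_per_test(READER_COST_RANGE[0], STRIP_COST_RANGE[0], 100)
high_per_test = amortized_cost_per_test(READER_COST_RANGE[1], STRIP_COST_RANGE[1], 100)
```

Under these assumptions the first-year outlay falls between ₹530 and ₹720, and the amortized cost per test between ₹20 and ₹66.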


3.2.2.3 Product Contract

This section is a distillation of lessons from interviews and workshops described in the previous section into specific attributes of the product and system. These attributes, summarized in table 3.1 below, directly translate into design guidelines for the product as well as the system, which determine the research direction and ultimate system design. Although technical constraints that resulted from this exercise did not directly impact system considerations, they are included here to provide a complete picture of the end to end system.

Target user: Rural small landholding farmers with adequate access to water resources and fertilizer distributors

Use case: Twice a year before planting in the Kharif and Rabi seasons, users take a composite soil sample representative of their farm, perform a soil test, and receive a customized soil fertilizer recommendation

Product Contract

| User data | User need | Hierarchy | Attribute | Technical constraint | System constraint |
|---|---|---|---|---|---|
| "Accuracy is the most important aspect of a quality soil test" | Can accurately measure macronutrients and pH in soil | High | Accuracy | Measures ppm values +/- 10% with respect to a KVK soil test | Provides soil fertilizer recommendation granularity (i.e. low, medium, high) equal to or greater than current soil testing facilities |
| "The current system is too slow, if I were able to get a result on my own fairly easily, I would do it" | Can perform test on their own | High | Usability | < 10 distinct steps from soil collection to result | Accommodates different typologies of farmers in terms of education level, literacy, language |
| "I had to travel 50km to the local KVK and I never got a soil test result back" | Can be used on the farm | High | Point of use | Handheld operation | System enables real-time recommendation generation on the field |
| "We need more information about how much fertilizer to use, what crops to grow, and how to use inputs effectively" | Can provide actionable feedback to improve soil health | High | Actionability | N/A | Provides customized soil fertilizer recommendation with additional information on ideal crop type for soil and best practices |
| "If the task takes me too much time I will not care anymore" | Can provide soil test results quickly | Medium | Time until results | < 10 minutes | |
| "I don't trust anyone but myself, and so I want to do the test myself" | Can be used by an individual farmer | Medium | Independence | Requires no equipment external to farm, minimal expertise or training needed | |
| "I don't mind paying if it's good quality" | Can easily be afforded | Medium | Affordability | < 1000 INR MRP | |
| "I can afford to buy this only once so longevity is important" | Can last a long period of time over multiple uses | Medium | Longevity | Lasts > 100 soil tests | Recommendation generation engine amenable to changes and updates over time |
| "We are used to spending many hours a day working in the field" | Can provide result with minimal physical activity | Low | Labor intensiveness | Minimal physical work to procure soil sample | |

Table 3.1 Product contract describing attributes that determine technical and system design constraints


3.3 Discussion of Research Question

As discussed in Chapter 2, this thesis is guided by the following question:

“What constitutes an actionable information system for resource-poor Indian farmers?”

Actionability is the property that enables the farmer to interpret and act upon soil health recommendations 100% of the time. The decision to focus on the problem of actionability in resource-poor settings is a direct result of the exploration of the problem area described in this chapter. The interactive workshop exercises emphasized the need for a complete support system accompanying a device in order for users to be able to take positive action on the basis of information provided to them. User interviews corroborated this, and additionally emphasized the need for individualized recommendations. Interviews with expert scientists at IPNI, IARI, and other institutions provided reassurance about the direction of the research, both in terms of focusing on the right problems and asking the right questions. The product contract described previously is a distillation of these lessons, and is what ultimately determined the chosen direction for this research, as well as the specific research question to be tackled.


Chapter 4 Actionability Experiment: Data and Analysis

This chapter provides a detailed description of the field experiment conducted in Ballarawad village in North Karnataka in January 2016, the purpose of which was to formulate a numerical index measuring the actionability of simulated soil health recommendations. This field experiment was conducted in partnership with The Deshpande Foundation, a large NGO in the area working to launch effective and scalable models of development in the surrounding rural communities. The first section describes the motivation behind the design of the experiment. The experimental methodology and setup are then explained, followed by an evaluation of the demographic data collected in the village under study. A detailed analysis of the experimental data is provided, leading to the formulation of an actionability index. The chapter concludes with a summary of lessons from the analysis that shape the design of the recommendation engine.

4.1 Motivation

The purpose of this experiment was to try and elucidate the various factors that affect actionability as it pertains to soil health advisory in a resource-poor setting, via direct interaction with farmers in an entire village. The idea of using the entire population of a representative village for studying actionability is to expose the diversity within a community that is currently ignored by the incumbent soil health advisory. A clearer understanding of such factors would then enable us to formulate a mathematical index: the “actionability index” that accurately captures the relationship among variables that affect a user’s measured ability to take action on soil health recommendations. Specific lessons from an analysis of actionability in a resource-poor setting would then be used to adjust the design of the soil recommendation database described in Chapter 5 to produce soil health recommendations with improved actionability for the user.

4.2 Experimental Design

The experiment was conducted as follows. A village was selected from the locality of the Deshpande Foundation’s (hereafter “DF”) major operations. Through a wide array of leadership programs and projects in these rural communities, DF engages residents and youth in finding solutions to collective problems in agriculture. As a result of their strong presence and vast array of successful projects in the area, DF is a trusted and well-respected brand name. This was important in order for us to generate high willingness to participate in our field experiment. That said, our experimental design ensures that the village we selected is on average representative of an Indian village, and that results are not unduly affected by the presence of DF.

Selection of Village

Table 4.1 summarizes key demographic information from the 2011 Indian Census for six candidate villages in and around Dharwad district in north Karnataka, where DF operates. Out of the six potential candidates for the study, Kadadhalli and Nagarhalli were clear outliers in that their average literacy was significantly higher than the state and national averages. Gudisagar also had a fairly high literacy in comparison to the state and national averages, so was similarly eliminated. Alagawadi's literacy was far lower than the state and national averages, so it would also not have been a good choice as a representative population to study. Of the remaining two, Hallikeri and Ballarawad, the latter was the better choice in terms of total number of households manageable with the resources at hand. The literacy rate of 0.6959 in Ballarawad, being so close to the state and national level averages, establishes to some extent the external validity of results from an experiment on the population of the village.

| Particulars | Hallikeri | Alagawadi | Ballarawad | Gudisagar | Nagarhalli | Kadadhalli |
|---|---|---|---|---|---|---|
| District | Gadag | Dharwad | Dharwad | Dharwad | Dharwad | Dharwad |
| Taluka | Mundargi | Navalgund | Navalgund | Navalgund | Navalgund | Navalgund |
| Gram-Panchayat | Hallikeri | Alagawadi | Shisuvinahalli | Gudisagar | Belahar | Gudisagar |
| Total No. of Houses | 647 | 1166 | 300 | 427 | 277 | 153 |
| Population | 2940 | 5838 | 1599 | 2109 | 1391 | 825 |
| Child (0-6) | 351 | 732 | 195 | 264 | 148 | 139 |
| Schedule Caste | 495 | 334 | 92 | 134 | 150 | 136 |
| Schedule Tribe | 262 | 518 | 262 | 4 | 28 | 0 |
| Literacy | 0.7219 | 0.653 | 0.6959 | 0.7442 | 0.7812 | 0.8061 |
| Total Workers | 1548 | 3238 | 974 | 1384 | 753 | 433 |
| Main Worker | 1525 | 2906 | 833 | 1108 | 458 | 425 |
| Marginal Worker | 23 | 332 | 141 | 276 | 295 | 8 |
| Child Sex Ratio | 1008 | 993 | 1032 | 977 | 852 | 988 |

Notes: This table compares and contrasts key demographic indicators across six villages in Karnataka. The Karnataka state average literacy in rural areas is 0.6873, while the Indian national average literacy in rural areas is 0.6890.

Table 4.1 Indian Census data for candidate villages in Karnataka state
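The literacy comparison behind the village selection can be reproduced with a short Python sketch using the figures from Table 4.1. The deviation measure below (the larger of the absolute gaps to the state and national rural averages) is an assumption made for illustration; the thesis describes the comparison qualitatively and also weighs the number of households, which this sketch does not model.

```python
# Literacy rates from Table 4.1 (2011 Indian Census).
LITERACY = {
    "Hallikeri": 0.7219,
    "Alagawadi": 0.653,
    "Ballarawad": 0.6959,
    "Gudisagar": 0.7442,
    "Nagarhalli": 0.7812,
    "Kadadhalli": 0.8061,
}

# Rural averages quoted in the table notes.
STATE_RURAL_AVERAGE = 0.6873
NATIONAL_RURAL_AVERAGE = 0.6890


def literacy_deviation(rate):
    """Worst-case absolute gap to the two reference averages (assumed metric)."""
    return max(abs(rate - STATE_RURAL_AVERAGE), abs(rate - NATIONAL_RURAL_AVERAGE))


# Villages ordered from most to least representative on literacy alone.
ranked_villages = sorted(LITERACY, key=lambda name: literacy_deviation(LITERACY[name]))

for name in ranked_villages:
    print(f"{name:<12} literacy={LITERACY[name]:.4f} "
          f"deviation={literacy_deviation(LITERACY[name]):.4f}")
```

On this measure Ballarawad lies closest to both averages, which is consistent with the choice described in the text.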

Methodology

Demographics were manually collected for each household in the chosen village, Ballarawad, with the assistance of 30 field volunteers from DF. A total of 246 households participated in the study; 21 households were either unavailable or unable to participate. One farmer from each household (the head of the household) was asked to participate in the exercise. The participating households were then divided into three test groups for the experiment: a control group and two treatment groups.

The control group farmers were provided with a basic soil health recommendation modeled on the current format of recommendations provided by government soil testing labs in India. Figure A.1 in appendix A presents a sample recommendation offered to the control group. These reports simply stated the soil test result and the resulting recommendation on what remedial action to take. The first treatment group was provided with the control group recommendations augmented with additional information on how to carry out the specific fertilizer treatments recommended. Figure A.2 in appendix A presents a sample recommendation offered to the first treatment group. The second treatment group was provided with the treatment 1 recommendations further augmented with additional information on the rationale behind the recommendation, that is, why it was important to carry out the advised action and the reasoning behind the formulation of the specific fertilizer recommendation. Figure A.3 in appendix A presents a sample recommendation offered to the second treatment group.

The task of providing point-of-use recommendations to autonomous farmers is analogous to the trend of providing online point-of-use tools for health diagnostics, such as WebMD and the Mayo Clinic. Moreover, as an intervention in human health is necessarily conservative, it serves as an appropriate starting point for modeling a soil health recommendation system for a resource-poor farmer whose risk-bearing capacity is similarly low. A typical online diagnostic system has the following components: symptom, diagnosis, what to do, how to do it, what the risks of not taking action are, and when to see the doctor. From this list, we extracted the components what, how, and why, with which to model our experimental recommendations. Test groups for the experiment were therefore designed to simulate settings in which progressively more information was provided to farmers in each subsequent group, which mimics the way in which health diagnostics provide information to risk-averse seekers of medical advice.

Each of the three test groups was further divided into three experimental environments, which dictated the manner in which the farmer interacted with the recommendation provided to him. In the ‘Farmer’ environment, each farmer was asked to read and understand the soil health recommendation on his own. He was asked to take as much time as he needed, but was not allowed to ask anyone for help. He was informed that at the end of the exercise, he would be asked some specific questions pertaining to the recommendation he had just read. In the ‘Farmer+Social Network’ environment, farmers were provided with the recommendation specific to their assigned group, and asked to discuss it with their social network (members of their household, neighbors, friends etc.).
Participants were informed that the recommendation report would be left with them for approximately two hours, after which they would be asked some questions. In the ‘Farmer+Entrepreneur’ environment, farmers were assisted by a local “entrepreneur”, in this case a volunteer from the partner NGO with some agriculture-related experience. During the interaction, farmers were allowed to take as much time as they needed to read and understand the recommendation, while asking for as much help as they needed from the volunteer. They were informed that the exercise would conclude with a question and answer session during which they would not be allowed to ask for assistance.

The experiment concluded with a detailed survey of each participating farmer, to collect quantitative and qualitative data measuring indicators that we posit are contributors to the actionability of the soil health recommendation provided to him. Sample survey collection sheets are shown in appendix B.

Stratification by Farmer Typology

Experimental sub-groups were randomized and stratified according to ‘typology’, a characteristic of a farmer defined on the basis of his education level and land holding size. Consideration of typology in this experiment was a direct result of an assessment of relevant literature, which indicates that the adoption of new technology in agriculture is severely hampered by a lack of consideration of farmer typology, particularly as regards smallholder farmers in developing countries (Tittonell et al. 2010, Goswami et al. 2014). Typology is a conceptual basis with which to classify a group under study (in this case, smallholder farmers). It captures the main sources of underlying variation in a population, thereby allowing the population to be categorized on the basis of its diversity. Studies show that for precise and effective technological interventions, a study of farmer typology is of practical interest. In addition to aiding improved adoption of new technologies, consideration of farm typology has been shown to help in understanding why a certain technology was adopted or rejected (Tittonell et al. 2010). Relevant bases for typology from the literature include size of land holdings, education, dependence on off-farm income, ownership of livestock, diversification of income, growth of cash crops, and food self-sufficiency. Studies on typology use a selection of factors such as those listed to create groups that embody a combination of factors, instead of considering each factor in isolation, where there is a likelihood of overlap.

For the purpose of this experiment, we chose education level and land holding size as the basis for typology definition, since these were determined by expert opinion (IPNI and DF) to be the primary factors affecting farmers’ capacity for growth and development in the agricultural domain. India's Census-Literate Population by Educational Attainment uses class 5 as the first cutoff for literacy, known as “primary” (Sharma et al., 2008). In the chosen village, the average education level was 5.05, so grade 5 was used as the cutoff to distinguish between typologies for low and high education. The five major categories of land holding defined by the Department of Agriculture (DOA) are as follows, in hectares (ha): marginal (below 1 ha), small (1-2 ha), semi-medium (2-4 ha), medium (4-10 ha), large (10 ha and above) (DOA, 2014). In the chosen village, the average land holding was 6.53 acres, which is equivalent to 2.64 ha. This shows that on average, farmers in this area are small to semi-medium. When considering only farmers with land holdings greater than zero (since landless laborers were categorized under a separate typology), the average was found to be 7.92 acres, or 3.21 ha.
The chosen cutoff is therefore the midpoint between small and medium farms according to the DOA definition, namely 3 ha, which is equivalent to 7.41 acres. The typology categories were defined as follows:

• Typology 1: Farmers with education >5th grade and land holdings ≥ 3 ha
• Typology 2: Farmers with education 5th grade or lower, and land holdings ≥ 3 ha
• Typology 3: Farmers with education >5th grade and land holdings < 3 ha
• Typology 4: Farmers with education 5th grade or lower, and land holdings < 3 ha
• Typology 5: Landless laborers (land holdings = 0)

Based on the notion that farmers belonging to different typologies were likely to behave differently from each other under similar experimental circumstances, it was important for us to consider typology while distributing participants into groups for the experiment, in order to eliminate typology-based bias in our results. Farmers of each typology in the village were therefore evenly and randomly split across each of the nine sub groups in the experiment, creating a random, stratified sample of farmer typology in each sub group, as shown in Table 4.2 below. Imperfect distribution of certain typologies across groups is due to the fact that certain households were unavailable to participate in the exercise on the day of the field experiment; however, their demographics had been collected in the first iteration, so they had been considered in the original group split. Additionally, outliers were removed after the original group creation, which also affected the evenness of the distribution of typologies across groups. In spite of these alterations, the typology spread has remained relatively uniform across sub-groups, and the means and variances of various indicators across the groups are comparable, as discussed next.
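The typology rules and the stratified split described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used in the field: the function names, the dictionary keys `edu` and `land`, the round-robin dealing scheme, and the conversion factor of 2.471 acres per hectare are all assumptions introduced here.

```python
import random
from collections import defaultdict

ACRES_PER_HA = 2.471  # assumed conversion factor (approximate)


def typology(edu_grade, land_acres, edu_cutoff=5, land_cutoff_ha=3.0):
    """Map education (grade completed) and land holding (acres) to typology 1-5."""
    if land_acres == 0:
        return 5  # landless laborers
    high_edu = edu_grade > edu_cutoff
    large_holding = (land_acres / ACRES_PER_HA) >= land_cutoff_ha
    if high_edu and large_holding:
        return 1
    if large_holding:
        return 2
    if high_edu:
        return 3
    return 4


def stratified_assign(farmers, n_groups=9, seed=0):
    """Shuffle farmers within each typology, then deal them round-robin
    into n_groups sub groups so every typology is spread evenly."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for farmer in farmers:
        by_type[typology(farmer["edu"], farmer["land"])].append(farmer)
    groups = [[] for _ in range(n_groups)]
    for t in sorted(by_type):
        members = by_type[t]
        rng.shuffle(members)
        offset = rng.randrange(n_groups)  # avoid always starting at group 0
        for i, farmer in enumerate(members):
            groups[(offset + i) % n_groups].append(farmer)
    return groups
```

Dealing each shuffled typology round-robin keeps the per-typology counts in any two sub groups within one of each other, which is one simple way to obtain the "evenly and randomly split" property; other stratified randomization schemes would serve equally well.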


This result indicates that the experimental results described in the following section are not a result of typology-specific bias in the grouping of participants.

Balance across Test Groups

Typology                   T1    T2    T3    T4    T5
Control                    25     9    27     9    13
Treatment 1                19    13    27    10    14
Treatment 2                18    13    27     6    14

Balance across Environments

Typology                   T1    T2    T3    T4    T5
Farmer                     20    13    31     7    14
Farmer + Social Network    23    11    26    10    12
Farmer + Entrepreneur      19    11    24     8    15

Table 4.2 Typology counts per Test Group and Environment

4.3 Examination of Data

In the first phase of the experiment, demographics were collected from each of the participating households in the village. From each head of household, the following set of information was collected: name, address and contact number, gender, age, education level completed, land holding size (acre), number of land fragments owned, crops grown, number of people living in the household, other sources of income, whether they followed rain-fed or irrigated farming, or both, annual income, whether they use mobile phones, whether they use the internet, and whether they have ever tested their soil in the past.

After being presented with recommendations specific to their test group and environment, farmers were asked to reproduce the soil fertilizer recommendation amounts for Nitrogen, Phosphorus, Potassium, and Gypsum from their respective reports. Farmers were allowed to reference the report while answering this question, but had to answer independently, regardless of environment. In other words, even the farmers who were allowed access to their social network or an entrepreneur while interpreting the recommendation had to work independently at the time of reporting their answer. This result was then subjected to a scoring mechanism for accuracy of interpretation on a scale of 0-4. Each correct answer was given a score of 1. Answers differing slightly from the reported amount but within a +/- 5 kg/acre range were also given a score of 1. If a farmer wrote down the soil test result amount instead of the fertilizer recommendation amount for a particular nutrient (these are related but not the same), he was given a score of 0.5, indicating some level of interpretation but not complete understanding. In a few cases, farmers identified the correct answers but wrote them down in the wrong order (N in place of P for example); these were awarded 0.5 points each. Wrong answers or blank spaces were awarded 0 points each.
A score of 4 thus indicates perfect interpretation of the recommendation, and a score of 0 indicates a complete lack of interpretability for that farmer. This score on a scale of 0-4 represents one of the outcome variables of interest in this analysis, ‘interpret’. Additional outcome variables of interest were collected from farmers, including a ranking from 0-10 on the ease of understanding the recommendation (‘ease’), a ranking from 0-10 on the ability to afford the recommended inputs (‘afford’), and a ranking from 0-10 on the ability to acquire the required inputs, in other words, a measure of the availability of inputs (‘acquire’). The sample survey sheet in Appendix B provides more insight into the set of quantitative and qualitative data collected from farmers at the end of the exercise.
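The scoring rules for ‘interpret’ can be made concrete with a short Python sketch. It is a hedged illustration rather than the scoring sheet used in the study: the function name, the dictionary layout, and the way a "wrong order" answer is detected (an exact match with another nutrient's recommended amount) are assumptions, and all quantities in the usage example are hypothetical.

```python
NUTRIENTS = ("N", "P", "K", "Gypsum")


def interpret_score(answers, recommended, soil_test, tol=5.0):
    """Score a farmer's reproduced recommendation on a 0-4 scale.

    answers:     nutrient -> amount written by the farmer (None if blank)
    recommended: nutrient -> fertilizer recommendation amount (kg/acre)
    soil_test:   nutrient -> soil test result amount shown on the report
    """
    score = 0.0
    for nutrient in NUTRIENTS:
        ans = answers.get(nutrient)
        if ans is None:
            continue  # blank space: 0 points
        if abs(ans - recommended[nutrient]) <= tol:
            score += 1.0  # correct, or within +/- 5 kg/acre
        elif soil_test.get(nutrient) == ans:
            score += 0.5  # wrote the soil test result instead
        elif any(ans == recommended[m] for m in NUTRIENTS if m != nutrient):
            score += 0.5  # right amount, wrong nutrient (assumed rule)
        # anything else is a wrong answer: 0 points
    return score


# Hypothetical report values, for illustration only.
recommended = {"N": 40, "P": 20, "K": 60, "Gypsum": 100}
soil_test = {"N": 112, "P": 9, "K": 150}
answers = {"N": 43, "P": 60, "K": 150, "Gypsum": None}
print(interpret_score(answers, recommended, soil_test))
```

In this hypothetical case the farmer earns 1 point for N (within tolerance), 0.5 for P (the K amount written in the P slot), 0.5 for K (the soil test value), and 0 for the blank Gypsum entry.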


Table 4.3 shows a summary of variables used in this analysis. The education level of farmers was encoded as a number representing the grade of school they successfully completed (1 for first grade, 2 for second grade, etc.). For farmers with some level of education beyond the 12th grade, the entry was coded as 13. Land fragmentation was posited to be significant to this field experiment, as recent studies in India have shown that land fragmentation is positively and significantly associated with inefficient farming (Manjunatha et al. 2013); a variable measuring this was therefore also included in this analysis. Notably, there were two outliers in land holdings, at 78 and 140 acres respectively, that were excluded from the calculation of these statistics as well as from the subsequent analysis.

Variable    Description                                           Mean    St. Dev   Min   Max
gender      Categorical variable specifying gender, M or F        -       -         -     -
age         Age in years                                          50.57   12.70     26    89
edu         Education level in years                              5.29    4.64      0     13
land        Land ownership in acres                               5.88    6.93      0     36
landfrag    Number of fragments of owned land                     1.32    1.02      0     5
household   Number of people supported in the household           5.62    2.78      1     16
env         Categorical variable specifying test environment:     -       -         -     -
            ‘Farmer’, ‘Farmer + Social Network’, and
            ‘Farmer + Entrepreneur’
testgrp     Categorical variable specifying experimental test     -       -         -     -
            groups: ‘Control’, ‘Treatment 1’, and ‘Treatment 2’
interpret   Output variable measuring ability to interpret        3.06    1.40      0     4
            recommendation
ease        Output variable measuring ease of interpretability    6.14    2.91      0     10
acquire     Output variable measuring farmer's ability to         5.81    2.67      0     10
            acquire recommended inputs
afford      Output variable measuring farmer's ability to         5.12    2.42      0     10
            afford recommended inputs

Table 4.3 Summary statistics and variable definitions

Exactly half of the surveyed population (122 individuals out of 244) revealed that they diversified their income from sources other than farming. Some of the listed external sources of income include cattle rearing, dairy, school teaching, employment with the army and other governmental agencies, private business ownership, and other employment in urban areas close by. Mobile phone usage was very high in this village; 87.7% of people surveyed said that they used a mobile phone, even if they didn’t personally own one. Unsurprisingly, internet penetration was low, with only 3.3% of individuals having ever used the internet. Qualitative interviews revealed that internet connectivity in the village was primarily available through the mobile phone network, and not through fixed lines. Only 4 individuals out of the 244 surveyed said that they had ever submitted a soil sample for testing.


Figure 4.1 shows summary statistics of key demographics as histograms of age, education level, land holding, and household size distributions in Ballarawad.

Figure 4.1 Graphical summary of demographics of Ballarawad

Table 4.4 shows a balance check of covariates across test groups and environments. The high p-values indicate that the variance in means across groups of these four primary indicators is minimal, and thus that the groups are balanced. One exception is the distribution of household sizes across test groups, for which the p-value is 0.0473, indicating that the variance test rejects the null hypothesis that the variances in household sizes are equal across different test groups, in favor of the alternative hypothesis that at least one test group has a different variance. This is unlikely to cause problems, as household size is posited not to have a significant effect on actionability.

                         Test Group                                       Environment
                         Control   Treatment 1   Treatment 2   p-value    Farmer   Farmer + Social Network   Farmer + Entrepreneur   p-value
                         (1)       (2)           (3)           (4)        (5)      (6)                       (7)                     (8)
Age                      51.86     51.70         48.00         0.4546     48.72    52.01                     51.08                   0.6359
Education Level          5.01      5.08          5.83          0.4940     5.14     5.28                      5.49                    0.7067
Gender (Fraction male)   0.80      0.86          0.90          -          0.84     0.89                      0.92                    -
Land Holding             5.88      6.49          5.22          0.5470     6.04     5.86                      5.72                    0.8274
Household Size           5.72      5.88          5.24          0.0473"    5.40     5.77                      5.71                    0.8299

Notes: This table reports the mean of covariates across test groups and environment groups, as well as results of a variance analysis of covariates across test groups and environments. p-values from the variance tests are noted in columns (4) and (8) respectively. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table 4.4 Statistical comparison of demographics across Test Groups and Environments
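A balance check of this kind can be illustrated with a one-way analysis of variance. The text does not state exactly which variance test produced the p-values in Table 4.4, so the sketch below assumes a standard one-way ANOVA comparing group means; the function name is hypothetical and the numbers in the usage example are toy values, not data from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy household sizes for three test groups (illustrative only).
control = [1, 2, 3]
treatment_1 = [2, 3, 4]
treatment_2 = [3, 4, 5]
print(one_way_anova_f([control, treatment_1, treatment_2]))
```

The p-value is the upper-tail probability of the F distribution with the returned degrees of freedom; a statistics package (for example `scipy.stats.f_oneway`) would normally be used to obtain it. A large p-value is read as evidence that the covariate is balanced across groups.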


4.4 Components of Actionability

The ultimate goal of this field experiment was to construct an index for actionability on the basis of various independent variables such as education, land holdings, age, gender etc. The index is intended to accurately capture the underlying complexity of the interaction among demographic parameters affecting actionability, as well as the behavioral and social nuances of the user group under study. For the purpose of this experiment, actionability of a soil health recommendation is hypothesized to comprise some combination of the following four components:

i. Interpretability: The ability of the farmer to interpret the specific fertilizer quantities recommended (‘interpret’)
ii. Ease: The ease with which a farmer is able to interpret and understand the recommendation (‘ease’)
iii. Availability: The availability of the recommended fertilizer inputs in terms of access to materials (‘acquire’)
iv. Affordability: How affordable the recommended solution is for the farmer (‘afford’)

The discussion of how these factors are combined is presented in section 4.6. Recall that questions from the survey were designed to capture each of these factors independently via a user specified rating. ‘Interpret’ represents a score from 0-4 measuring a farmer’s ability to accurately reproduce the fertilizer recommendations provided to him. ‘Ease’ represents a score on a scale of 0 to 10, ranking how easy it was for a farmer to understand the recommendation provided to him, where 0 meant he found it extremely difficult, and 10 indicated that he found it extremely easy to understand. ‘Acquire’ and ‘afford’ represent scores on a scale of 0 to 10 ranking a farmer’s ability to acquire and afford the recommended inputs respectively.
Participants were instructed to rank how easy it would be for them to acquire inputs solely in terms of location of store, availability of transportation, and other logistical concerns, in order to isolate this from the economic considerations of the ‘afford’ ranking.

                Test Group                                        Environment
                Control   Treatment 1   Treatment 2   p-value     Farmer   Farmer + Social Network   Farmer + Entrepreneur   p-value
                (1)       (2)           (3)           (4)         (5)      (6)                       (7)                     (8)
Interpret       2.95      3.45          2.75          0.0012**    2.90     3.02                      3.27                    0.2083
Ease            6.45      6.66          5.41          0.0845      5.58     6.62                      6.38                    0.1269
Acquire         6.61      5.93          4.83          0.0017**    6.19     5.70                      5.49                    0.032"
Afford          5.63      5.35          4.32          0.1540      5.11     5.69                      4.53                    0.0149*

Notes: This table reports the mean of output variables of interest across test groups and environment groups, as well as results of a variance analysis across test groups and environments. p-values from the variance tests are noted in columns (4) and (8) respectively. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table 4.5 Statistical comparison of output variables across Test Groups and Environments

Table 4.5 contains a preliminary check of the variability of the four output variables of interest across the test groups and environment groups that form the nine sub groups of the experiment. Examination of the p-values in column (4) indicates that the mean scores of ‘interpret’ and ‘acquire’ vary significantly across the three test groups. A similar examination of the p-values in column (8) indicates that the mean scores of ‘acquire’ and ‘afford’ vary significantly across the three environments. Notably, the variation of all four parameters across typology groups (not shown) is significant at the 0.1% level, indicating that typology-based effects in this demographic are significant in their impact on the components of actionability.

4.5 Analysis of factors affecting components of Actionability

We have established that, for the purpose of this analysis, Actionability comprises a combination of the following four components: Interpretability, Ease, Availability, and Affordability. The previous section covered a basic analysis of the variability of these components across test groups and environments. This section examines each of the four components in more depth, in order to obtain a richer understanding of the relationship between each component and the combination of demographic and experimental factors affecting it. Given the set of functional relationships that emerge from this analysis, we may glean the specific effect of the demographic and experimental variables on our outcome of interest, Actionability.

A preliminary exploratory analysis of variables is conducted using kernel regressions. The resulting plots of variables against the four output variables of interest, ‘interpret’, ‘ease’, ‘acquire’, and ‘afford’, are shown in Appendix C. These plots describe the relationship of key explanatory variables with each variable of interest. Note that the majority of the relationships described are approximately linear in nature, indicating that OLS regression would be an effective tool to estimate these relationships in this data set.

Linear regression models are then used to estimate the effects on each component. Model coefficients with different combinations of predictors are reported, to assess the stability of the relationships of dependent variables with individual predictors. Demographic variables are introduced progressively into models in the following order: education, gender, age, land holding, household, based on the informed notion that these would be important determinants of actionability in this setting, in decreasing order of priority. A final model is fit for each component in which the complete set of demographic and test variables is included to assess significance. An evaluation of these results gives rise to a model equation for each component that consists of the demographic and test variables found to be significant to that component. Each model is described in detail in the following four sub sections.

4.5.1 Model 1: ‘Interpret’

The results of a series of ordinary least squares regression estimations on interpretability are shown in Table 4.6. Coefficient estimates in the table show that, in every model, education level and gender are statistically significant predictors of interpretability at the highest significance level. Every additional year of education was seen to improve a farmer’s interpret score by ~0.07 out of 4 on average, a small per-year effect that nonetheless accumulates into a meaningful gap between farmers with no education and those with several years of schooling. The gender effect was much more visible, with male participants scoring ~0.88 out of 4 higher than their female counterparts on average. Additionally, coefficients in column (5) of the table indicate that the treatment 1 group had a positive effect on interpretability, increasing the score on average by 0.46 out of 4. This result is significant at the 5% significance level, indicating that the additional information on how to carry out a recommendation was beneficial to farmers in that group. The treatment 2 group, on the other hand, showed no significant positive impact.

OLS Regression of 'interpret'

                          (1)           (2)           (3)           (4)           (5)
Education Level           0.08283***    0.06541***    0.072227***   0.0693418***  0.070947***
                          (0.01872)     (0.01900)     (0.019551)    (0.0200740)   (0.019617)
Gender (Male)                           0.84355***    0.876498***   0.8354018***  0.884075***
                                        (0.24531)     (0.245869)    (0.2495081)   (0.245566)
Age                                                   0.009954      0.0075476     0.005192
                                                      (0.006977)    (0.0071359)   (0.007031)
Land Holding Size                                                   0.0002627     -0.002632
                                                                    (0.0133397)   (0.013048)
Household Size                                                      0.049417      0.041343
                                                                    (0.0325371)   (0.031783)
Farmer + Entrepreneur                                                             0.337923"
                                                                                  (0.203649)
Farmer + Social Network                                                           0.028624
                                                                                  (0.201275)
Treatment 1                                                                       0.455790*
                                                                                  (0.200347)
Treatment 2                                                                       -0.296218
                                                                                  (0.206217)
Intercept                 2.6164***     1.99308***    1.425617**    1.3180568**   1.273182**
                          (0.13171)     (0.22240)     (0.455485)    (0.4611320)   (0.469880)
R-squared                 0.07488       0.1181        0.1256        0.1346        0.1928
Adjusted R-squared        0.07105       0.1108        0.1146        0.1165        0.1617
F-Statistic               19.59         16.14         11.49         7.406         6.209
p-value                   1.46E-05      2.63E-07      4.60E-07      1.78E-06      7.48E-08

Notes: This table reports estimates of the effect of demographic parameters on the ability of farmers to interpret soil health reports. Each column reports coefficients from a regression of the 'interpret' score on the education level and successively added controls. The results in columns (2)–(4) are from models that include dummies for a range of demographic variables. Column (5) shows the result of a full regression model including all variables. The sample size is 244. Standard errors are reported in parentheses. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table 4.6 OLS Regression analysis of ‘interpret’

An ANOVA of the full model shown in column (5) indicates that there is strong evidence that the parameters edu, gender, and testgrp have non-zero coefficients in the regression, at a 0.1% significance level. The resulting best fit model equation for interpretability is of the following form:

interpret ~ 1 + edu + gender + testgrp (1)
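To show how the fitted coefficients translate into a predicted score, the sketch below evaluates the full model reported in column (5) of Table 4.6 for a single farmer. It is an illustration under stated assumptions: the function and dictionary names are hypothetical, the baseline categories are taken to be female, the ‘Farmer’ environment and the ‘Control’ group, and the coefficients come from the full model rather than from a refit of the reduced equation (1), whose coefficients are not reported here.

```python
# Coefficients copied from column (5) of Table 4.6.
COEF = {
    "intercept": 1.273182,
    "edu": 0.070947,
    "male": 0.884075,
    "age": 0.005192,
    "land": -0.002632,
    "household": 0.041343,
}
ENV_EFFECT = {
    "Farmer": 0.0,  # assumed baseline
    "Farmer + Social Network": 0.028624,
    "Farmer + Entrepreneur": 0.337923,
}
TESTGRP_EFFECT = {
    "Control": 0.0,  # assumed baseline
    "Treatment 1": 0.455790,
    "Treatment 2": -0.296218,
}


def predict_interpret(edu, male, age, land, household,
                      env="Farmer", testgrp="Control"):
    """Linear prediction of the 'interpret' score (not clipped to 0-4)."""
    return (COEF["intercept"]
            + COEF["edu"] * edu
            + COEF["male"] * (1 if male else 0)
            + COEF["age"] * age
            + COEF["land"] * land
            + COEF["household"] * household
            + ENV_EFFECT[env]
            + TESTGRP_EFFECT[testgrp])


# Hypothetical farmer: male, grade 5, age 50, 5 acres, household of 5.
base = predict_interpret(5, True, 50, 5, 5)
with_t1 = predict_interpret(5, True, 50, 5, 5, testgrp="Treatment 1")
print(round(base, 3), round(with_t1, 3))
```

Because the model is linear and additive, moving the same farmer from the control group to treatment 1 shifts the prediction by exactly the treatment 1 coefficient, regardless of his other characteristics.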


4.5.2 Model 2: ‘Ease’

The results of a series of ordinary least squares regression estimations on ease are shown in Table 4.7. Coefficient estimates in the table show that education level is a statistically significant predictor of ease at the highest significance level in every model. Gender is significant at the 5% level or better in every model that includes it, while age is significant at the 5% level only in column (3).

OLS Regression of 'ease'

                          (1)           (2)           (3)           (4)           (5)
Education Level           0.31914***    0.29611***    0.31495***    0.30631***    0.309611***
                          (0.03466)     (0.03561)     (0.03646)     (0.03747)     (0.036279)
Gender (Male)                           1.11513*      1.20616**     1.10738*      1.174022*
                                        (0.45975)     (0.45850)     (0.46572)     (0.454141)
Age                                                   0.02750*      0.02357"      0.014978
                                                      (0.01301)     (0.01332)     (0.013002)
Land Holding Size                                                   0.01319       0.008739
                                                                    (0.02490)     (0.024130)
Household Size                                                      0.06946       0.049539
                                                                    (0.06073)     (0.058779)
Farmer + Entrepreneur                                                             0.625615"
                                                                                  (0.376621)
Farmer + Social Network                                                           0.916830*
                                                                                  (0.372230)
Treatment 1                                                                       0.123598
                                                                                  (0.370515)
Treatment 2                                                                       -1.228079**
                                                                                  (0.381371)
Intercept                 4.44405***    3.62005***    2.05249*      1.91242*      2.256225*
                          (0.24394)     (0.41682)     (0.84940)     (0.86073)     (0.868979)
R-squared                 0.2594        0.2771        0.2903        0.2963        0.3556
Adjusted R-squared        0.2564        0.2711        0.2814        0.2816        0.3309
F-Statistic               84.78         46.18         32.72         20.05         14.35
p-value                   2.20E-16      2.20E-16      2.20E-16      2.20E-16      2.20E-16

Notes: This table reports estimates of the effect of demographic parameters on the ease with which farmers can interpret soil health reports. Each column reports coefficients from a regression of the 'ease' ranking on the education level and successively added controls. The results in columns (2)–(4) are from models that include dummies for a range of demographic variables. Column (5) shows the result of a full regression model including all variables. The sample size is 244. Standard errors are reported in parentheses. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table 4.7 OLS Regression analysis of ‘ease’
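The significance codes attached to each coefficient follow from the ratio of the estimate to its standard error. The sketch below reproduces that mapping for entries in column (5) of Table 4.7. It is an approximation under stated assumptions: the function name is hypothetical, and the two-sided p-value uses the standard normal distribution in place of the t distribution, which is reasonable only because the sample size of 244 leaves a large number of residual degrees of freedom.

```python
import math


def signif_code(coef, se):
    """Return (z, p, code) for a coefficient and its standard error.

    p is a two-sided p-value from the normal approximation; code uses
    the thresholds listed in the table notes (the 10% level is marked
    with a double-quote character there).
    """
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    if p < 0.001:
        code = "***"
    elif p < 0.01:
        code = "**"
    elif p < 0.05:
        code = "*"
    elif p < 0.1:
        code = '"'
    else:
        code = ""
    return z, p, code


# Estimates and standard errors from column (5) of Table 4.7.
rows = {
    "Education Level": (0.309611, 0.036279),
    "Farmer + Social Network": (0.916830, 0.372230),
    "Treatment 1": (0.123598, 0.370515),
    "Treatment 2": (-1.228079, 0.381371),
}
for name, (coef, se) in rows.items():
    z, p, code = signif_code(coef, se)
    print(f"{name}: z = {z:.2f}, code = {code!r}")
```

Under this approximation the treatment 2 coefficient falls in the ‘**’ band (p between 0.001 and 0.01) and the social network coefficient in the ‘*’ band, in agreement with the codes printed in the table.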

Every additional year of education was seen to improve a farmer’s ease ranking by ~0.31 out of 10 on average, which highlights a fairly significant difference between how easy it is for educated versus uneducated farmers to read and understand a recommendation. The gender effect was also highly visible, with male participants scoring ~1.17 out of 10 higher than their female counterparts on average. Age was also seen to play a small but significant role in determining ease rankings. Additionally, coefficients in column (5) of the table indicate that the treatment 2 group had a negative effect on ease, decreasing the score on average by 1.23 out of 10 relative to the control group. This result is significant at the 1% significance level, indicating that the additional information on why to carry out a recommendation was not only not beneficial to farmers in that group, but was a detriment to ease of interpretability. Qualitative interview data corroborated this finding, in that treatment 2 group farmers felt that the recommendations looked content heavy, and it was therefore difficult for them to read through the whole thing and parse the relevant nutrient information from it. Additionally, those farmers that did find it easy mentioned that this was because they were already familiar with the terms of art used and practices described, because of their vast experience in agriculture.

Another important lesson that emerges from this set of models is the benefit that the social network environment provides farmers. Relative to a farmer working on his own, a farmer who was allowed to work with his social network (his friends, neighbors, children) to understand the recommendation provided a ranking ~0.92 out of 10 higher on average. Note that this is even higher than the positive delta in rankings seen in the entrepreneur assisted group, who ranked an average of ~0.63 higher than the individual farmers. This highlights the importance of the interdependent and social environment that characterizes agricultural communities in India.

An ANOVA of the full model shown in column (5) indicates that there is strong evidence that the parameters edu and testgrp have non-zero coefficients in the regression, at the highest significance level. Additionally, age, gender, and environment were significant at the 1% level. The resulting best fit model equation for ‘ease’ is of the following form:

ease ~ 1 + edu + testgrp + age + gender + env (2)

4.5.3 Model 3: ‘Acquire’

The results of a series of ordinary least squares regression estimations on ‘acquire’ are shown in Table 4.8. Coefficient estimates in the table show that, in every model, education level was the only statistically significant demographic predictor of rankings for availability of inputs, at the 5% significance level. Further, the effect of level of education on the ranking was small in magnitude, with every additional year of education improving a farmer’s acquire ranking by only ~0.09 out of 10 on average. Interestingly, an ANOVA of the full model in column (5) of the table indicates that the testgrp variable was a highly significant indicator of a farmer’s perception of the availability of inputs to him. In particular, farmers in treatment 2 had an ‘acquire’ ranking 1.81 points out of 10 lower than the control group on average. A potential explanation for this is that the excess of information in the treatment 2 group recommendations might have confused farmers rather than clarified the recommendation for them, thereby leading them to believe that there was more that they were required to do, and therefore purchase. This once again corroborates the interview data collected, which suggests that a balance of information is needed in a recommendation in order to provide farmers with relevant information, but not in a manner that confuses them. The ANOVA of the full model indicates that there is evidence that the parameters edu and testgrp alone have non-zero coefficients in the regression. Therefore, the resulting best fit model equation for ‘acquire’ would be of the following form:

acquire ~ 1 + edu + testgrp (3)


OLS Regression of 'acquire'

                          (1)           (2)           (3)           (4)           (5)
Education Level           0.08302*      0.08219*      0.086714*     0.08389*      0.094954*
                          (0.03657)     (0.03802)     (0.039274)    (0.04048)     (0.039072)
Gender (Male)                           0.04019       0.062061      0.02138       0.250555
                                        (0.49093)     (0.493906)    (0.5032)      (0.489102)
Age                                                   0.006608      0.00419       0.001781
                                                      (0.014016)    (0.01439)     (0.014003)
Land Holding Size                                                   0.00002       -0.008938
                                                                    (0.0269)      (0.025988)
Household Size                                                      0.04986       0.041473
                                                                    (0.06562)     (0.063304)
Farmer + Entrepreneur                                                             -0.805862*
                                                                                  (0.405616)
Farmer + Social Network                                                           -0.564614
                                                                                  (0.400886)
Treatment 1                                                                       -0.798854*
                                                                                  (0.399039)
Treatment 2                                                                       -1.805710***
                                                                                  (0.410730)
Intercept                 5.29368***    5.26398***    4.887283***   4.779***      6.040197***
                          (0.25736)     (0.44509)     (0.914986)    (0.9300)      (0.935878)
R-squared                 0.02085       0.02088       0.02178       0.02433       0.1123
Adjusted R-squared        0.01681       0.01275       0.009557      0.003832      0.07811
F-Statistic               5.154         2.57          1.782         1.187         3.288
p-value                   2.41E-02      7.87E-02      1.51E-01      3.16E-01      8.59E-04

Notes: This table reports estimates of the effect of demographic parameters on a farmer’s ability to acquire recommended inputs. Each column reports coefficients from a regression of the 'acquire' ranking on the education level and successively added controls. The results in columns (2)–(4) are from models that include dummies for a range of demographic variables. Column (5) shows the result of a full regression model including all variables. The sample size is 244. Standard errors are reported in parentheses. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table 4.8 OLS Regression analysis of ‘acquire’

4.5.4 Model 4: ‘Afford’

The results of a series of ordinary least squares regression estimations on ‘afford’ are shown in Table 4.9. Coefficient estimates in the table show that, in every model, none of the demographic indicators was a good predictor of rankings for affordability of inputs. Instead, an ANOVA of the full model in column (5) indicates that the only significant predictors for the outcome of interest were the environment and test group variables. Relative to the control group, both treatment groups resulted in lower rankings of affordability overall. With respect to environments, the social network group was the only group in which farmers felt more comfortable with their ability to afford recommended inputs, relative to farmers working alone. Notably, the low adjusted R-squared values of the models in this case indicate that there are likely to be other factors affecting a user’s rating of affordability that were not captured in this experiment. It is also possible that the 0-10 rating in this case is more a measure of a user’s perception of how easy or difficult it would be to afford the required inputs, therefore embodying an underlying valuation of a good rather than the value of the good itself. The model equation that best represents this relationship is as follows:

afford ~ 1 + env + testgrp (4)

OLS Regression of 'afford'

                          (1)           (2)           (3)           (4)           (5)
Education Level           0.03723       0.02900       0.030194      0.029823      0.039792
                          (0.03340)     (0.03466)     (0.035822)    (0.036888)    (0.035462)
Gender (Male)                           0.39839       0.404159      0.381552      0.439300
                                        (0.44759)     (0.450496)    (0.458502)    (0.443916)
Age                                                   0.001742      -0.000906     -0.006709
                                                      (0.012784)    (0.013113)    (0.012709)
Land Holding Size                                                   -0.008803     -0.015548
                                                                    (0.024513)    (0.023587)
Household Size                                                      0.062309      0.050136
                                                                    (0.059791)    (0.057456)
Farmer + Entrepreneur                                                             -0.619947"
                                                                                  (0.368142)
Farmer + Social Network                                                           0.465542
                                                                                  (0.363850)
Treatment 1                                                                       -0.328304
                                                                                  (0.362173)
Treatment 2                                                                       -1.463482***
                                                                                  (0.372785)
Intercept                 4.90108***    4.60670***    4.507383***   4.363831***   5.282273***
                          (0.23503)     (0.40580)     (0.834567)    (0.847387)    (0.849416)
R-squared                 0.005109      0.008369      0.008445      0.01298       0.109
Adjusted R-squared        0.000998      0.0001394     -0.003949     -0.007754     0.07472
F-Statistic               1.243         1.017         0.6814        0.626         3.18
p-value                   2.66E-01      3.63E-01      5.62E-01      6.80E-01      1.20E-03

Notes: This table reports estimates of the effect of demographic parameters on a farmer’s ability to afford recommended inputs. Each column reports coefficients from a regression of the 'afford' ranking on the education level and successively added controls. The results in columns (2)–(4) are from models that include dummies for a range of demographic variables. Column (5) shows the result of a full regression model including all variables. The sample size is 244. Standard errors are reported in parentheses. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table 4.9 OLS Regression analysis of ‘afford’


4.5.5 Discussion

This section highlights some key findings related to the data set and analysis performed. Model equations (1) through (4) indicate that by far, the education level of a farmer has the most significant impact on his actionability, via its impact on the components of actionability: interpretability, ease, availability, and affordability. Farmers with higher education levels are more easily able to interpret written soil health recommendations, and they find it easier to do so. While this may not sound surprising, it is in fact a significant finding in a demographic in which the average education level is just over the 5th grade level. Note that interpretability of recommendations should not be confused with knowledge of agricultural practices as regards soil health in general. A common observation in Indian agricultural communities is the passing down of traditional knowledge and practices in farming, that allows many farmers to be extremely knowledgeable about their trade, even as they are unable to read and write.

The gender of the participant was also seen to have a large impact on some, if not all, of the components, as was age. Notably, the size of a farmer’s land holdings and the number of people supported in his household did not significantly influence outcomes of this experiment, but were controlled for in the regression analyses in any case. Stratification by farmer typology allowed for these results to emerge without the typology-based bias that we hypothesized. Recall that farmer typology was defined on the basis of a farmer’s education level and the size of his land holdings. The main takeaway from these results is that while farmer typology is a significant consideration for maximization of actionability, its effect is primarily driven by the education component, and not land holdings.

A second empirical result is the better performance of treatment 1 farmers relative to treatment 2 and control farmers, evident from the positive coefficients associated with this variable in the OLS regression Tables 4.6 and 4.7. Farmers in treatment 1 received the “intermediate” level of recommendation in terms of the amount of information contained: more than control and less than treatment 2. Specifically, recommendations to this group contained the basic recommendation augmented with additional information on how to carry out the prescribed action. It is worth noting that treatment 2 was associated with a negative coefficient in all the regression models, indicating that the additional information included in those recommendations (why to carry out the prescribed action) was actually detrimental to overall actionability. In other words, too much information can be as harmful as not enough.

Finally, the results show that farmer’s interpretability is maximized when farmers are assisted by field agents (“entrepreneurs” in this analysis), or when they work in groups, relative to farmers working on their own. This finding corroborates observations in the field as regards the highly social and mutually supportive nature of rural communities in India.

These findings are relevant in that they directly inform the design of soil health recommendations that would be maximally actionable and therefore beneficial to this demographic. In order to maximize the benefit to users in this context, soil health advisories would need to be (i) targeted at more educated and progressive farmers initially to gain traction, (ii) disseminated to groups of farmers rather than individuals, in order to assimilate the natural social setting, and (iii) targeted at farmer groups containing a mix of education levels, genders, and ages, to leverage the social network to counter the typology-based effects on actionability.

4.6 Formulation of an Actionability Index

Recall that actionability of a soil health recommendation was hypothesized to comprise a linear combination of the following four components: Interpretability of the recommendation, Ease with which a farmer could interpret the recommendation, Availability of the recommended agricultural inputs, and Affordability of those inputs. There are a number of different ways in which these four components can be combined to calculate an Actionability Index. This section assesses the options for such a calculation, with a view to elucidating the most logical and usable interpretation of Actionability in the resource-poor setting.

Actionability Index Options

The simplest normative index is generated by combining the components described in equations (1) through (4) additively. In other words, this actionability index is formed by adding up the components, with each equally weighted, as in equation (5). The multiplicative factor of 2.5 on ‘interpret’ accounts for the fact that the maximum possible score for this component was 4, whereas the remaining three components were ranked out of 10. The index is scaled such that all possible values lie within a range of 0 to 1.

Actionability Index 1 = (2.5*interpret + ease + acquire + afford)/40 (5)

The underlying assumption of an index of this form is that all components are equally valid and important in determining actionability, and are therefore equally weighted. We will maintain this notion of all components of the index being equally valid and important even for the two other conceptions of Actionability Index discussed later in this section. Further, an index of this form additionally makes the assumption that actionability shortfalls in each component can be compared one for one with each other. For instance, farmer no. 1 who can easily read and interpret a recommendation, and can afford the recommended inputs, but does not have the requisite transportation to access the fertilizer shop, is comparable to farmer no. 2 who can easily acquire and afford the requisite inputs, but struggles to interpret the soil health recommendation provided to him. In other words, the additive logic of equation (5) indicates that a farmer could have a positive (non-zero) actionability index even when they have no literacy and cannot interpret the recommendation whatsoever, or when they have no means of accessing the recommended fertilizer, or when they are impoverished and cannot afford it.

It is likely that this notion is a fair representation of reality, since observational data indicates that farmers tend to work closely with and depend heavily on their respective social networks. In this context, what that means is that farmer no. 2 who struggles to interpret a soil health recommendation on his own can rely on his children, neighbors, and peers in his village to help him understand the recommendation, and farmer no. 1 who cannot access the fertilizer distributor on his own could do so by jointly renting transportation with his network of farmers (a common practice).

In light of this heavy reliance on an individual’s social network in this setting, one could then argue that the ranking for ‘ease’ is not a relevant contributor to actionability. If interpretability of a recommendation is achieved or maximized via the support of a social network, then an individual’s level of ease of doing so is rendered redundant. Moreover, the notion of ‘ease’ of acting upon the offered recommendation may intuitively appear more subjective than the need to interpret, access, and afford what is being recommended. This leads to a second normative formulation of an actionability index as follows:

Actionability Index 2 = (2.5*interpret + acquire + afford)/30 (6)

The previously listed assumptions still apply in this case, the only difference being that ‘ease’ is posited not to have a strong influence on actionability overall.

A third formulation of the index can be based on the logic of how an individual farmer may act upon the recommendation in real life, where they must interpret the recommendation, the recommended input must be available to them, and they must be able to afford it, indicating a multiplicative logic where all three actions must take place as shown in equation (7). This index is similarly scaled to fall within a range of 0 to 1.

Actionability Index 3 = (interpret * acquire * afford)/400 (7)

The underlying assumption behind an index of this format is that while all components are equally important and valid, their effects are compounding rather than independent. For example, an illiterate and socially isolated farmer with a very poor interpretability score (say 2 out of 4) brings down the overall actionability index by 50% relative to a perfect score of 4, which in essence captures how critical it is that a farmer be able to interpret the recommendation provided to him. The same comparison is true of the remaining components as well. In an extreme situation, this index goes to zero when one or more components reach a value of zero. While this index will undoubtedly result in a much lower average actionability score overall, it potentially captures the true nature of factors on the ground that might explain why the current soil testing scenario in India remains so severely underutilized.
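As a minimal sketch of how equations (5) through (7) could be evaluated for a single farmer, the three formulations can be written as one small function. The function name, argument names, and the example scores below are illustrative assumptions rather than part of the field instrument; only the three formulas and the score ranges come from the text.

```python
def actionability_indices(interpret, ease, acquire, afford):
    """Return (index1, index2, index3) for one farmer, each scaled to [0, 1].

    interpret is scored 0-4; ease, acquire and afford are scored 0-10,
    as described for equations (5) through (7).
    """
    if not 0 <= interpret <= 4:
        raise ValueError("interpret must lie between 0 and 4")
    for name, score in (("ease", ease), ("acquire", acquire), ("afford", afford)):
        if not 0 <= score <= 10:
            raise ValueError(f"{name} must lie between 0 and 10")

    index1 = (2.5 * interpret + ease + acquire + afford) / 40  # equation (5)
    index2 = (2.5 * interpret + acquire + afford) / 30         # equation (6)
    index3 = (interpret * acquire * afford) / 400              # equation (7)
    return index1, index2, index3


# Hypothetical farmer: interprets the recommendation perfectly and can
# afford the inputs, but has no way of reaching the fertilizer shop.
example = actionability_indices(interpret=4, ease=6, acquire=0, afford=8)
print(example)
```

For this hypothetical farmer the additive indices 1 and 2 remain well above zero, while the multiplicative index 3 collapses to zero because a single component is zero, which is the contrast the discussion above draws between the formulations.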

Statistical Comparison of Actionability Indices

The actionability index formulations from equations (5) through (7) were applied to the data and each assessed based on correlation with key demographic and experimental variables, identified as explanatory in the previous analysis of individual components. The model equations for individual components identified the following set of features as most explanatory in the data set: education, gender, test group, and environment, in addition to typology, which was the basis for stratification in the data set. Each of the three actionability indices was evaluated against these five variables, in addition to ‘interpret’, as this is the most objective component of actionability measured in the field experiment. Visual and tabular representations of the correlation between the selected variables and the three chosen actionability indices are presented in appendix D.

Indices 1 and 2 show a clear positive correlation between actionability and interpretability, which is as one would expect. Index 3 shows a similar positive correlation; however, the compounding nature of the formulation causes a large number of participants to be assigned a zero or close-to-zero value for actionability, in some cases in spite of a perfect interpretability score of 4. All three indices show a weak positive correlation with education, which is corroborated by interview data: many farmers were extremely knowledgeable about soil health and related agricultural practices, in spite of having little or no education. One would therefore expect some level of correlation of actionability with education (a proxy measure of literacy), but not expect education level to be excessively explanatory on its own. Notably, index 3 shows a very low level of correlation with education level, except for a large concentration of individuals with no education and an index score of 0.

The variation with typology was largest in magnitude for indices 1 and 2, and less so for index 3, again due to the compounding effect that results in extremely low actionability scores on average. Index variations across test group and environment were comparable for all three indices, with index 1 showing slightly more variation across environments. A comparison of index variation across gender showed that on average, females scored approximately 0.12 out of 1 less than their male counterparts on actionability, which corroborates the findings from the data. Notably, indices 1 and 2 generated stable results across all parameters, while index 3 resulted in a larger number of outliers when compared across significant parameters.

It is therefore clear that actionability index 3 would not be a good choice for an informative index that explains the variation in the dataset. The choice therefore comes down to indices 1 and 2. While these two are similar to each other, actionability index 1 does not provide much more than index 2 in terms of explanatory power, but introduces the rather subjective parameter, ‘ease’, that cannot be assessed objectively when farmers are registered in the database in the way the other three parameters can (interpretability, availability, and affordability). Therefore, actionability index 2 is the most logical choice for an index that captures the underlying behavior of users, providing an objective means for assessing its components.

The following points present a summary of trends that emerge from the comparison of actionability indices:

1. Typology 1 and 2 farmers perform better than other typologies.
2. The ‘farmer + social network’ group has a better mean actionability index than ‘farmer’ or ‘farmer + entrepreneur’, but has greater variance in score. ‘Farmer + entrepreneur’ has the lowest variance overall. This is an interesting insight, in terms of the benefit (albeit highly variable) that a farmer’s social network can provide him, in contrast to the more reliable (less variable) but lesser benefit of external assistance.
3. Females have a systematically lower actionability index, which could potentially be explained by gender segregation in general.
4. Actionability index 3 scores are far lower on average, as previously discussed.
5. Indices 2 and 3 are consistent with each other in their ordering of farmers by actionability score, visible by the fact that they both rise together.
6. Actionability index 2 captures a greater degree of variation between farmers in different environments as well as test groups. This is visible in Figures 4.2 and 4.3, where the three lines corresponding to index 3 are flatter close to 0, indicating that the bottom ~20% of farmers are indistinguishable from each other when using index 3, since they were all assigned an actionability score of 0.
7. In contrast, closer to 1 on the actionability index scale, the lines corresponding to index 3 rise sharply, implying that as we reach the upper subset of better performing farmers, index 3 exposes a far greater distinction among them than does index 2.


Note: F: Farmer, FSN: Farmer + Social network, FE: Farmer + Entrepreneur

Figure 4.2 Comparison of Actionability indices 2 and 3 partitioned by Environment

Note: C: Control, T1: Treatment 1, T2: Treatment 2

Figure 4.3 Comparison of Actionability indices 2 and 3 partitioned by Test Group


Implications of the choice of index

Index 3, which is formulated as the product of three components of actionability, was rejected for primarily two reasons: (i) it did not show stable correlation with significant parameters, and (ii) the compounding effect led to very low actionability scores on average, which did not quite match the experience of farmers in the experiment. Comparing the formulations of index 3 and index 2 offers us an interesting qualitative insight: a formulation like index 3 would be a better description of a farmer working in isolation, or someone who is educationally poor or socially isolated. In reality, the social setup of the experiment, which mirrors the social nature of communities, takes this stark notion of index 3 and transforms it into the additive notion embodied in the chosen index 2. In other words, the social setup that allows farmers to work with each other and to leverage shared knowledge allows a tradeoff among components of actionability to the benefit of all involved.

The comparison of indices 2 and 3 therefore provides a mathematical intuition for why self-help groups and social networks are beneficial, since individuals are able to underwrite each other’s risks. An appropriate analogy is the provision of micro credit in rural communities. In these high risk situations of lending, lenders ask for information about an individual’s social network as a way to mitigate risk. In contrast, larger banks in general do not ask about an individual’s community, and instead collect a plethora of information specific to the individual client. In soil health, current database designs mimic the larger bank approach, focusing on highly detailed user demographic information and information related to the user’s land and crops. An arguably better approach accommodates the additional social network considerations that mimic the micro-finance approach to risk, and that follow from the formulation of the chosen actionability index 2.
4.7 Lessons for Database Design

The analysis of demographic and experimental data described in this chapter thus far has led to a number of interesting observations. These observations inform our design of a soil health recommendation system, in terms of format, content, and presentation of a recommendation to the target user group (small holdings rural Indian farmers). At the core of this soil health recommendation system is the soil health database previously introduced in Chapter 3, and elaborated on in Chapter 5. This database provides an integrated system for farmers to record results of soil tests, receive recommendations, and track information over time. The schema of the underlying tables in the database describes the information that is used to formulate the recommendation; it is this information that determines the level of actionability that each recommendation generates. Results from this analysis may therefore be used to tailor the schema, and therefore the specific recommendations that the farmer receives, in order to maximize actionability. A number of lessons emerge from the analysis described in this chapter, which are described herewith.

Test group, environment, education level, and gender were found to be the four most significant predictors of actionability for farmers. Of these, test group and environment are effectively the two variables over which we, as an external entity, have some control as we design a recommendation system for this user group. Test group here acts as a dummy variable for amount of information provided, with control representing the minimum level of information provided, and the two treatment groups adding successively more information. Analysis of the data shows that the Treatment 1 group received the highest actionability scores on average as compared to the other two test groups. What this tells us is that there is such a thing as too much information, and recommendations need to be designed optimally such that key information is not lost among details that are unnecessary, and at times intimidating. Qualitative interview data with farmers revealed that many of them found the large amounts of text daunting, and seemed to be more likely to want to give up on the exercise when the text of the recommendation spilled over onto a second page. Further experimentation may reveal the sweet spot in terms of the amount of information that leads to greatest actionability.

Another key finding from the data was the success of the ‘farmer + social network’ environment of interaction, as apparent from their higher actionability scores: farmers working with their social network scored 0.64 on average, as compared to 0.59 for farmers working alone, and 0.61 for farmers working with an entrepreneur. This finding directly feeds into the delivery mechanism planned for the recommendation system. If farmers work better in groups, it makes sense to provide recommendations to them along with information about other farmers in their area and/or farmers who are growing similar crops to them. In the database, a farmer’s account information, biographical information, and demographic information are stored as separate tables, connected through the key of the user’s ID. After a farmer has created an account, he can add information about his individual farms. Because his farms may not be in the same location as his primary home address, he adds address information about the location of the specific farm. There are optional fields to add GPS coordinates for the farm that can be based either on IP geolocation, or more accurately, a cartographic analysis of the land. Additionally, the farmer inputs the size of the plot, so that the database can accommodate the relative importance of this particular plot of land in proportion to his total landholdings. Storing the farm information provides a convenient way to keep track of the test results specific to each farm over time, as well as a means to link relevant farmers to each other.

The remaining two features, gender and education level, are in essence features of a population that determine actionability, but over which we may assert no control. The typology based stratification also controlled for typology-specific variation among sub groups in the experiment, which is another factor of the population over which we would have no control in reality. However, the results of this analysis still provide insight into how these factors can be used to design recommendations or form social clusters in such a way that they minimize the detrimental impact of these factors for an individual farmer and maximize his actionability. Typology 1 farmers, that is, those with a high level of education and relatively small land holdings, were seen to have a better actionability score on average compared to other typologies. Actionability levels consistently declined on average for subsequent typologies, with Typology 5 (arguably, the worst off: landless laborers) scoring the lowest. This result indicates that typology-based effects, which were predicted to be relevant in this context, are indeed so and must be taken into consideration when organizing group-level recommendation delivery. A typology-based approach to recommendation generation is described in detail in Chapter 5.

Education levels were also seen to be weakly correlated with actionability scores, which is not a surprising result. In addition, male farmers were seen to consistently score higher on actionability when compared to their female counterparts. These results provide some guidance in the formulation of the “social network” groups within which farmers discuss and understand recommendations. Pursuant to the analysis detailed in this chapter, the database has the capacity to store and utilize location specific information with which to link farmers to one another. These findings further augment that design with guidelines on how to form impactful groups: select farmers with a good mix of genders, education levels, and typologies.
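One way the grouping guideline above might be operationalized is sketched below. This is an illustrative greedy heuristic, not a method used in the thesis: the function names, the dictionary keys ("typology", "gender", "educated"), and the sample farmers are all hypothetical. Each farmer is placed in the currently smallest group and, among equally small groups, the one whose members share the fewest attributes with that farmer.

```python
def overlap(group, farmer):
    """Count attribute matches between a farmer and the members of a group."""
    return sum(
        (member["typology"] == farmer["typology"])
        + (member["gender"] == farmer["gender"])
        + (member["educated"] == farmer["educated"])
        for member in group
    )


def form_mixed_groups(farmers, n_groups):
    """Greedily assign farmers to n_groups, keeping groups balanced and mixed."""
    groups = [[] for _ in range(n_groups)]
    for farmer in farmers:
        # Prefer the smallest group; break ties by the least attribute overlap.
        target = min(groups, key=lambda g: (len(g), overlap(g, farmer)))
        target.append(farmer)
    return groups


# Hypothetical roster: one woman and one man in each of typologies 1 to 4.
# Typologies 1 and 2 are the higher-education typologies.
roster = [
    {"name": f"farmer_{t}{g}", "typology": t, "gender": g, "educated": t <= 2}
    for t in (1, 2, 3, 4)
    for g in ("F", "M")
]
mixed = form_mixed_groups(roster, n_groups=2)
for group in mixed:
    print([member["name"] for member in group])
```

A greedy pass of this kind does not guarantee the best possible mix for every roster; it is meant only to show that the mix of genders, education levels, and typologies can be computed from fields the database already stores.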


Chapter 5

Soil Database and Recommendation Engine

This chapter describes in detail the structure and function of the soil health database that forms the core of the recommendation generation system. This chapter is written based on work jointly done with Leah Slaten, an undergraduate student at MIT’s Department of Computer Science, in her role as an undergraduate research assistant to the author during the months of January through May 2016. The chapter starts with a broad functional overview of the soil database system, followed by an in depth examination of the database schema. Subsequently, the logical algorithm for customized recommendation generation is described. The chapter concludes with a summary of future functionalities planned for the soil database and recommendation engine, and potential avenues for improvement.

5.1 Functional Overview

The database provides an integrated system for farmers to record results of soil tests, receive recommendations, and track their farm and test related information over time. Upon creating an account, the farmer enters his biographical information such as name and address, as well as some demographic information, including gender, number of dependents (size of household), total landholdings size, and educational level into the system. This demographic information may be used to tailor the detail in the recommendations that the farmer receives. An analysis of the data from the field experiment described in the preceding chapter indicates that farmers of different typologies react differently to the level of information detail in a recommendation. For certain types of farmers, providing additional information on how to carry out a recommendation, in addition to what to do, improved average levels of actionability. On the other hand, providing certain other types of farmers with the same level of detail served only to confound them and make the recommendation less actionable. Keeping track of the size of a farmer’s landholdings as well as the level of fragmentation is important, as they serve as indicators of how important each land segment is to the farmer, both in terms of revenue as well as productivity over time. The set of biographical and demographic data collected for the database from each farmer is therefore a direct result of the analysis in the preceding chapter, which identifies the demographic factors that may be considered important in this context.

5.2 Database Schema

Figure 5.1 shows the schema of the soil health database, which describes the logical view of the entire database. Data is organized into tables, which are further organized into logical sections based on function. The six inter-related sections are as follows: user information, business information, test information, store details, recommendation, and supplementary.


Figure 5.1 Schema of Soil Health Database

The “User Information” section stores information about various users of the system. The database system is designed for use by three different types of users: system administrators, farmers, and store owners. In the database, a user’s account information, biographical information, and demographic information are stored as separate tables, connected to each other through the unique system generated user identification number. Storing this information in separate tables affords flexibility, as it allows the system to create different types of users, each of whom might not require the same treatment in the system. When creating an account, administrators are prompted to provide information about themselves that will be stored in the “user_properties” table. Further biographic and demographic information is unnecessary for them. A system administrator has the capacity to add and remove users, modify the contents of the database, and track database contents over time. A store owner who is creating an account will be asked to provide, in addition to the basic account information, information about their store that will be stored in the “biographic_information” table. The store address is a useful resource for connecting nearby farmers and store owners. The primary user of this system, the farmer, is asked to provide information for both the previously mentioned tables, in addition to more detailed demographic information that is stored in the “demographic_information” table. This table includes fields such as gender, number of dependents, total landholdings size, and educational level. These additional details are useful in tailoring recommendations to specific users in order to make them as actionable as possible. All the tables in the “User Information” section of the database are linked by the unique system generated identification number that is stored in the “user_id” field in each table.

The “Business Information” section stores information about farms and stores. Farmers can add farms to their account (by populating the “farm_properties” table), and in order to do so they must enter the address and size of each farm holding. Since his farms may not be in the same location as his primary home address, a farmer adds address information about the location of the specific farm separately from his home address. The table structure requires that farmers enter the address in the form of GPS co-ordinates of the four corners of each farm they own in order to be able to map geographical locations relative to stores and other farms. We envision this capability being incorporated into the final tool (a mobile phone or a tablet) taken to villages in India and used to create user accounts there.
Interview data from the experiment described in the previous chapter indicated that many farmers own more than a single plot of land, and this table is useful for keeping track of all the land fragments controlled by a single farmer. The farmer is additionally required to input the size of each plot of land, so that the database can understand the relative importance of each plot of land in proportion to his total landholdings. Storing the farm information in this way provides a convenient way to keep track of the test results specific to each farm over time. In a similar way, a store owner can control multiple stores, the information for which is entered in the “store_properties” table. This section thus makes it easy for users to add or remove businesses at any time without needing to entirely recreate their accounts.

The “Store Details” section of the database allows store owners to record the inventory of each store in the database. Again, storing this information separately from basic store details affords the store owner greater flexibility in adding and removing stock.

The “Test Information” section is the integral piece of the user end of the database system. Within this section, there are two tables: “soil_test” and “test_result”. The “soil_test” table stores information about the test: the date the test was performed, the crop being grown on the land being tested at that time, and so on. The “test_result” table stores the results of each test. Each measured value (i.e. N, P, K, and pH) is stored in a separate row, and rows common to a single test are linked by a unique test identification number in the “test_id” field. A farmer may use one plot of land to grow different crops during different seasons, so it is important to store the crop as information specific to each test rather than as information specific to the farm. Additionally, the water source may vary depending on the crop that is being grown. All of this test information is used to compile a full recommendation for the farmer on the basis of his entered test result.

The “Recommendations” section is the critical piece of the system that enables value delivery to the user. This forms the core of the back end, and parses farmers’ test results to generate actionable recommendations. The basic recommendation table provides recommendations based on crop type, crop variety, soil type, water source, etc. and returns a recommendation that specifies the amount of each nutrient needed for optimal yield. Since some fertilizers work best with multiple rounds of application on different dates, the soil database includes support for this format of recommendation. This table additionally includes fields “how” and “why” that contain information that could be used to explain to farmers how to carry out a specific recommendation, and why they will benefit from following it. The “recommendation_history” table copies information from the overall “recommendation” table, linking it to a specific test, in order to keep track of the recommendations given to farmers for each test. Just as in the “recommendation” table, each test will have a “recommendation_history” row for each nutrient. The “recommendation_survey” table stores farmers’ feedback on each test and recommendation, asking them to explain if they followed the recommendation, and if not, why not. This table is potentially useful to administrative users in optimizing the recommendations given to farmers over time, in order to render them maximally actionable.

The “Supplementary” section provides additional information to support the recommendation, including a conversion table from fertilizer to nutrient quantity, and a table explaining what symptoms to expect in various crops in case of nutrient deficiencies. Together, all of these sections form a complete database system that can provide farmers with actionable recommendations for fertilizer applications based on the results of their tests.
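To make the one-row-per-nutrient layout of the “Test Information” section concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The table names “soil_test” and “test_result” and the “test_id” field come from the description above; every other column name, the column types, and the sample values are assumptions made for illustration and may differ from the actual schema in Figure 5.1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE soil_test (
        test_id      INTEGER PRIMARY KEY,
        farm_id      INTEGER NOT NULL,  -- assumed link to farm_properties
        test_date    TEXT NOT NULL,     -- date the test was performed
        crop         TEXT NOT NULL,     -- crop grown at the time of the test
        water_source TEXT               -- e.g. 'rainfed' or 'irrigated'
    );
    CREATE TABLE test_result (
        test_id  INTEGER NOT NULL REFERENCES soil_test(test_id),
        nutrient TEXT NOT NULL,         -- 'N', 'P', 'K' or 'pH'
        value    REAL NOT NULL,
        PRIMARY KEY (test_id, nutrient)
    );
    """
)

# One hypothetical test, stored as one soil_test row plus four result rows.
conn.execute(
    "INSERT INTO soil_test VALUES (1, 10, '2016-03-01', 'wheat', 'irrigated')"
)
conn.executemany(
    "INSERT INTO test_result VALUES (?, ?, ?)",
    [(1, "N", 20.0), (1, "P", 8.0), (1, "K", 55.0), (1, "pH", 6.4)],
)

# All values belonging to a single test are recovered through test_id.
rows = conn.execute(
    "SELECT nutrient, value FROM test_result WHERE test_id = ? ORDER BY nutrient",
    (1,),
).fetchall()
print(rows)
```

Keeping the crop and water source on the test row rather than on the farm row reflects the point made above that both can change from season to season on the same plot.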
The framework also includes additional support for fertilizer store owners, with a view to further developing the system to allow for interaction between farmers and store owners that produces benefits for both parties.

5.3 Recommendation Generation

The primary value add of the system is rooted in the result entry and recommendation generation process. Farmers input information about their soil tests and corresponding results, and in return receive recommendation information specifically tailored to their needs. There are several tables related to the recommendation. The most important two are the table providing recommendations by nutrient level and the table that stores recommendation history. The former table is used to generate a recommendation based on test results. This information is then transferred to the latter table so that the farmer can review previous recommendations.

Recommendations vary based on a number of factors, including crop (and its specific variety, if applicable; if no variety is entered, the database defaults to the most common variety of the particular crop), water source (rainfed or irrigated), and amount of rainfall in the region. We use a default yield goal for each type of crop variety, based on standard documented yields for each crop. For each variety, this is calculated as the average of the standard yield range for the crop under specific conditions. The table of recommendations by nutrient level additionally stores information about the required amount of each nutrient in the standard unit of kilograms per acre. It also contains specific dates of application, and describes how much fertilizer should be added at each specific time and by what method. The majority of commonly grown crops in India were found to have a maximum of three split applications, so the database accommodates three such fields for each recommendation. Non-mandatory fields that are unused are left blank.

In order to calculate how much fertilizer should be added, the recommendation engine subtracts the results of the soil test from the ideal amount of nutrient. This represents the amount of the nutrient that needs to be added to the soil. It is important to note that every fertilizer has a different proportion of nutrient content, so this number must be converted to understand how much of the specific fertilizer must be added. To simplify this conversion process, the database contains a table of fertilizers, which is used to convert the raw number into a fertilizer amount. The fertilizer table stores fertilizer names, and the corresponding N-P-K ratio of the fertilizer (as many fertilizers are a combination of nutrients). For example, urea, which is a nitrogen based fertilizer, would be stored in this table with a ratio “46-0-0”, meaning that urea contains 46% Nitrogen, and 0% Phosphorus and Potassium respectively. Therefore, if a field required 46 kilograms per acre of Nitrogen, one would be required to apply 100 kilograms per acre of Urea on that field. The recommendation engine uses this ratio from the fertilizer table to extract a conversion factor to convert the raw quantities of each nutrient to the amount of fertilizer required.

The pH recommendation works slightly differently. If pH is below the recommended level, it recommends liming to increase the pH. If pH is above this level, it recommends application of gypsum.
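The subtraction and conversion steps described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual code: the function names and arguments are hypothetical, a single target pH is passed in by the caller, and clamping a nutrient surplus to zero is an added assumption, since the text does not say how a surplus is handled.

```python
def fertilizer_needed(required, measured, npk_ratio, nutrient):
    """Return kg/acre of fertilizer needed to close a nutrient gap.

    required, measured: nutrient amounts in kg/acre.
    npk_ratio: string such as "46-0-0" (percent N, P and K by weight).
    nutrient: one of "N", "P" or "K".
    """
    # Assumption: a surplus (measured > required) calls for no fertilizer.
    deficit = max(required - measured, 0.0)
    percentages = dict(zip("NPK", (float(p) for p in npk_ratio.split("-"))))
    percent = percentages[nutrient]
    if percent == 0:
        raise ValueError(f"fertilizer {npk_ratio} supplies no {nutrient}")
    # A fertilizer with x% nutrient content needs 100/x kg per kg of nutrient.
    return deficit * 100.0 / percent


def ph_advice(ph, target_ph):
    """Return the amendment suggested by the pH rule, or None if on target."""
    if ph < target_ph:
        return "lime"
    if ph > target_ph:
        return "gypsum"
    return None


# Worked example from the text: 46 kg/acre of nitrogen supplied as urea.
print(fertilizer_needed(required=46, measured=0, npk_ratio="46-0-0", nutrient="N"))
```

The printed value reproduces the urea example above: a 46 kg/acre nitrogen requirement corresponds to 100 kg/acre of a 46-0-0 fertilizer.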

User-Centric Recommendation Design

From the problem-finding phase of this research project onward, we have focused on designing solutions with the end user in mind. The field experiment described in Chapter 4 revealed a number of important considerations for designing a soil health recommendation system, at the center of which is the smallholder Indian farmer. In keeping with those findings, the recommendation database not only incorporates the data fields that were found to affect users’ actionability, but also incorporates logic that tailors recommendations to users’ specific needs on the basis of relevant factors. This algorithmic approach to designing custom recommendations is described here. As the final form of the algorithm requires experimentation with the point-of-use sensor we are developing, the discussion in this section should be read as an illustrative example.

Figure 5.2 shows the variation of mean actionability scores across test groups for each typology. Recall that test groups represent the amount of information provided to a farmer: the control group receives the basic recommendation telling the farmer what to do, treatment 1 adds information on how to do it, and treatment 2 further describes the rationale behind why taking the recommended action is beneficial.

Note: Typology 1: >5 edu, ≤3ha; Typology 2: >5 edu, >3ha; Typology 3: ≤5 edu, ≤3ha; Typology 4: ≤5 edu, >3ha; Typology 5: 0ha (landless). Here “edu” denotes years of education and “ha” denotes hectares of land held.

Figure 5.2 Variation of mean Actionability scores across Test Groups, grouped by Farmer Typology

Figure 5.2 shows that actionability scores are closely linked with the type and amount of information provided to farmers. Further, the impact that different types of information have on a farmer’s actionability score depends strongly on what “type” of farmer he is. For instance, the graph shows that the treatment 2 group performed consistently poorly in terms of actionability scores overall, indicating that farmers did not find it helpful to know the rationale behind a recommendation (the why), regardless of farmer type. In fact, the numbers suggest that it only served to confuse them. However, Typology 1 and Typology 3 farmers (both with small land holdings) showed the biggest drop in actionability in the treatment 2 group compared to treatment 1, implying that they were the most adversely affected by this excess of information. While nearly all typologies benefited from the additional information provided in the treatment 1 group (except Typology 1 farmers, who were slightly disadvantaged, if not indifferent), Typology 2 and Typology 4 farmers (those with large land holdings) saw a larger rise in mean actionability than their counterparts with smaller land holdings. Typology 1 and 2 farmers (those with more education) scored consistently higher on actionability than their counterparts with fewer years of education.

Although lessons from Typology 5 farmers (landless laborers) do not necessarily affect the design of a recommendation system, since they are not the primary intended users, it is nevertheless interesting to note that this group scored the lowest of all on the actionability scale. Typologies were defined prior to the start of the experiment, and this fifth category was included on the premise that not owning any land was very likely to have a negative influence on a farmer’s ability to interpret and act upon soil health recommendations, regardless of education level. This assumption is confirmed by the data and corroborated by qualitative interview data from the experiment, in which farmers revealed that their familiarity with fertilizers and soil health jargon was what made the exercise easy for them to understand and follow. Landless laborers generally work on contract to perform manual labor on another’s land, and are unlikely to be involved in or aware of higher-level decisions affecting land owners, such as those regarding soil quality, fertilization, and irrigation. It is therefore not surprising that Typology 5 farmers have the lowest actionability scores overall.

The lessons from the observed typology-based differences in actionability can be accommodated directly in the logic of the recommendation generation engine. Figure 5.3 shows a part of the decision tree algorithm that uses the above insight to generate user-centric, customized recommendations for farmers on the basis of education level and land holding size (which in turn define typology as we have used it in this study). The graph in Figure 5.2 informs the typology-based decisions at each step in the decision tree. The system’s most basic recommendation is based on the recommendations generated and used by soil health labs in KVKs in the state of Karnataka. The algorithm therefore starts with this basic format, and subsequently adds information that will make it more actionable for farmers of different types.

Figure 5.3 Decision tree algorithm for generation of customized recommendations

Farmers with low education and high land holdings (Typology 4) were seen to benefit the most from the augmented recommendations in the treatment 1 group, that is, recommendations augmented with information on how to carry out the recommended action. The interview data suggests that more detailed instructions are beneficial for farmers of this type. Therefore, the recommendation engine generates an augmented recommendation format for these farmers containing additional detailed instructions on how to carry out the prescribed actions. A sample recommendation report of this kind is shown in Figure E.1 in Appendix E.

Farmers with high education and high land holdings (Typology 2) scored high on actionability on average. Additionally, they were the only typology that did not see a large drop in actionability scores in treatment 2 as compared to treatment 1. Rather, as the graph in Figure 5.2 depicts, farmers of Typology 2 found augmented recommendations to be as actionable as their simpler counterparts. In cases such as this, where the data shows user indifference between two types of recommendations, the algorithm defaults to providing more information rather than less. Therefore, for farmers of this category, the recommendation engine generates an augmented recommendation format containing additional detailed instructions on how to carry out the prescribed actions, as well as why these actions would be beneficial to them. A sample recommendation report of this kind is shown in Figure E.3 in Appendix E.

Finally, for the remaining types of farmers, the data shows that augmented recommendations improve actionability overall, so the recommendation engine generates augmented recommendations containing information on what action to take and how to take it. A sample recommendation report of this kind is shown in Figure E.2 in Appendix E.

Thresholds for the definitions of farmer typology are currently based on experimental data from the farmer population in Ballarawad, Karnataka. Should the need arise, the recommendation system allows an administrator to alter the cutoffs for education and/or land holding size, so that the system can be tailored to the needs of the specific population being targeted.
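As an illustration of the typology-based branching and the adjustable cutoffs described in this section, the following Python sketch classifies a farmer using the thresholds from the note to Figure 5.2 and selects the information layers for the recommendation. It is a sketch under stated assumptions, not the decision tree in Figure 5.3: the names `TypologyCutoffs`, `classify_typology`, and `recommendation_format` are hypothetical, and Typology 4 and the remaining typologies are collapsed into the same "what + how" format because the text distinguishes them only by the level of detail in the instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypologyCutoffs:
    """Administrator-adjustable thresholds (defaults follow the Figure 5.2 note)."""
    education_years: float = 5.0
    land_hectares: float = 3.0


def classify_typology(education_years: float,
                      land_hectares: float,
                      cutoffs: TypologyCutoffs = TypologyCutoffs()) -> int:
    """Map education and land holding to typology 1-5."""
    if land_hectares == 0:
        return 5  # landless, regardless of education
    high_edu = education_years > cutoffs.education_years
    high_land = land_hectares > cutoffs.land_hectares
    if high_edu:
        return 2 if high_land else 1
    return 4 if high_land else 3


def recommendation_format(typology: int) -> tuple[str, ...]:
    """Return the information layers included in the recommendation."""
    if typology == 2:
        # Indifferent between formats, so default to more information (cf. Figure E.3).
        return ("what", "how", "why")
    # Typology 4 (cf. Figure E.1) and the remaining typologies (cf. Figure E.2).
    # Sending Typology 5 down this branch is an assumption of this sketch.
    return ("what", "how")


# Default cutoffs: 8 years of education and 4 ha falls in Typology 2.
print(classify_typology(8, 4), recommendation_format(classify_typology(8, 4)))

# A deployment in another region could pass its own (hypothetical) cutoffs.
regional = TypologyCutoffs(education_years=8.0, land_hectares=1.0)
print(classify_typology(8, 2, regional))  # 4 under the altered cutoffs
```

Keeping the cutoffs in a single immutable object means an administrator can change them per deployment without touching the classification logic.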
If this solution is implemented in a state in India that does not match the average demographic statistics of Karnataka, the cutoffs for education and land holding size may be altered to reflect the average conditions of the farmers in that area, for that specific instance of its implementation.

In addition to the typology-based customization of recommendations described above, the recommendation engine also incorporates some key information deemed useful for farmers in general. Each recommendation contains an “Additional Information” section that describes what action farmers should take in very rainy conditions, as this often concerns farmers and prompts them to add more fertilizer than recommended (which is an unnecessary cost). Additionally, this section provides warnings on the symptoms of low nutrient levels, to allow farmers to diagnose problems in their crops at an early stage. As this project moves into its next phase, research continues on completing and refining the algorithm.

5.4 Future Improvements

In its current form, the database has all the basic functionality required to provide farmers with recommendations based on their soil test results and tailored to their specific typologies. Going forward, we envision a number of improvements that will make the system more useful to farmers. These include sending farmers timed notifications reminding them when to implement each part of the recommendation, as well as longer-term recommendations about how to rotate their crops, when to sow and plant seeds, and which crops are best to grow given the soil type.

More broadly, we would like to augment the system with features that allow a deeper level of customization for farmers. Detailed information on fertilizer availability within a region will allow the recommendation engine to tailor soil health recommendations to the fertilizers that are available in that region. Implementing this functionality would involve expanding the current system to include storeowners as another category of user. Note that while the database schema already contains the data field necessary to implement this, the recommendation generation engine would need to be updated to include this logic. Storeowners would be allowed to upload and modify their inventories in the system, and farmers would thus receive recommendations that capture the most up-to-date information on the fertilizers available in their region, thereby maximizing overall actionability. This would require large-scale expansion of the project such that many stores buy into the system, but we hope that it would go far towards increasing the actionability of these recommendations for farmers, as well as empowering farmers by connecting them directly with other stakeholders in the soil health value chain.
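One way the proposed availability-aware logic might look is sketched below in Python. Everything here is hypothetical: the `Store` record, the region labels, and the function names are illustrative assumptions, since the thesis describes this feature only as a future extension.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """A hypothetical storeowner record with a self-reported inventory."""
    name: str
    region: str
    in_stock: set[str] = field(default_factory=set)


def fertilizers_available(region: str, stores: list[Store]) -> set[str]:
    """Union of the fertilizers stocked by any store in the given region."""
    available: set[str] = set()
    for store in stores:
        if store.region == region:
            available |= store.in_stock
    return available


def filter_options(options: list[str], region: str, stores: list[Store]) -> list[str]:
    """Keep only the candidate fertilizers a farmer could buy locally."""
    available = fertilizers_available(region, stores)
    return [name for name in options if name in available]


stores = [
    Store("Store A", "Region 1", {"Urea", "DAP"}),
    Store("Store B", "Region 1", {"MOP"}),
    Store("Store C", "Region 2", {"Urea"}),
]
print(filter_options(["Urea", "DAP", "MOP"], "Region 2", stores))  # ['Urea']
```

If the filtered list is empty, a real system would need a fallback, for example recommending the nutrient amount alone; that choice is left open here.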

Chapter 6

Conclusion and Future Work

Smallholder farmers in rural India face a number of different problems of varying priority, a major one of which is the management of the health of their soil. Although a range of soil testing options is available to them, none have been successful in providing farmers with available, usable, affordable, and actionable information to aid their decision making. This research has made progress towards addressing that need, with the design of a robust information system to accompany a novel point-of-use soil testing technology developed at MIT.

A combination of field research, interactive experiments, and statistical analysis of resulting data has revealed that in order to maximize the actionability of soil health recommendations for rural smallholder farmers, a solution must satisfy the following broad set of criteria:

1. A successful solution must be designed to provide value at the point of use, that is, at the user’s farm.
2. Typology-based differences across farmers indicate that maximum actionability may be achieved when recommendations are customized to the specific needs of a user.
3. A fine balance must be achieved in terms of the complexity of information provided to farmers in recommendations. Analysis of experimental data reveals that an excess of information can be detrimental to a user’s ability to interpret and benefit from soil health advisory.
4. Group-based advisory dissemination, which mirrors the natural social order of rural agricultural communities in India, will allow for maximal user engagement with the system.

A novel point-of-use soil sensor and recommendation system have been designed with consideration of these criteria. The resulting recommendation generation engine and core database structure described in Chapter 5 represent the final product of this research. While this design is by no means complete, it represents a step in the right direction to empower the farmer with information about his soil and crops, so that he can make informed fertilization and irrigation decisions to improve crop yields.

Some methodological lessons also emerge from this research. As previously described, the exploration of the problem area started jointly and later diverged into independent explorations of technology and recommendation systems, each of which was designed, optimized, and tested independently. The results of these independent explorations are now at a stage where they may be combined into a complete system, ready for an end-to-end test. This approach to combined research, involving cycles of convergence and divergence, has proven effective in extracting joint lessons that affect the design of the technology as well as the recommendation system. Notably, it is the joint findings from the convergence stages that dictate the direction of each branch of the research. Thus, an important takeaway from this method as it applies to future work is to accelerate the process of joint problem discovery so that more time and effort may be spent in the divergent phases of the research, which arguably generate the most value for the end user.

The information system for recommendation generation described in this thesis offers value to a number of different stakeholders. If such a system were available along with a point-of-use sensor, it would be of value to corporate farming entities, agricultural NGOs, and farmers, and would additionally serve as a mechanism for meeting the soil testing needs of the nation. The provision of information on soil health indicators of this kind is complementary to the current policy goals of the Indian government, in particular the National Soil Health Card Scheme, which aims to curb fertilizer overuse by improving farmers’ knowledge of their soil health.

Future Work

This research represents the beginning of a solution to the problem of poor soil health and information for smallholder farmers in rural India. The next set of enhancements that can be made to the system might therefore focus on augmenting the system with features determined to provide a tangible benefit to the end user. From the perspective of the device, this could include the development of additional testing channels to test other important indicators of soil health such as micronutrients and minerals present in water for irrigation. The recommendation system design is also amenable to generalization for advisory design in a number of applications in rural settings, such as advisory related to the quality of drinking water, milk, and others.

In translating the products of this work into an end-to-end system on the ground in India, a number of considerations come to mind. Importantly, how could such a system coexist with government and private players in this space? The answer to this question requires further thought and consideration of various commercialization channels. An analysis of policy and user needs gaps indicates that poor soil health is in fact a major concern for farmers in India. Further, initiatives like the National Soil Health Card Scheme indicate that this need has been recognized by the Government of India, which has taken steps to tackle the issue. A number of private players have also entered this market to address the issue of poor soil health and productivity, such as Tata Consultancy Services with its mobile platform for farmers, “mKrishi”. One potentially viable channel for engaging effectively with farmers is an NGO working in the agriculture sector, or one working with large numbers of farmers within a given area. Groups such as agricultural NGOs and corporate farming entities provide the benefit of a trusted brand name and an established presence within farming communities, which is of great value when introducing a new technological solution. Multiple channels are therefore available to take such a solution to the end user, and it remains to be determined what the most effective strategy will be.

Appendices

Appendix A: Sample recommendation reports from experiment

Figure A.1 Sample recommendation report provided to Control group

Figure A.2 Sample recommendation report provided to Treatment 1 group (page 1)

Figure A.2 (continued) Sample recommendation report provided to Treatment 1 group (page 2)

Figure A.3 Sample recommendation report provided to Treatment 2 group (page 1)

Figure A.3 (continued) Sample recommendation report provided to Treatment 2 group (page 2)

Appendix B: Sample survey sheet for data collection

Figure B.1 Sample survey sheet for data collection in actionability experiment (page 1)

Figure B.1 (continued) Sample survey sheet for data collection in actionability experiment (page 2)

Appendix C: Kernel regressions of actionability components to assess relationships with key explanatory variables

Figure C.1 Kernel regression outputs of ‘interpret’ on key explanatory variables

Figure C.2 Kernel regression outputs of ‘ease’ on key explanatory variables

Figure C.3 Kernel regression outputs of ‘acquire’ on key explanatory variables

Figure C.4 Kernel regression outputs of ‘afford’ on key explanatory variables

Appendix D: Statistical analyses of Actionability indices

           Test Group                                       Environment
           Control   Treatment 1   Treatment 2   p-value    Farmer   Farmer + Social Network   Farmer + Entrepreneur   p-value
           (1)       (2)           (3)           (4)        (5)      (6)                       (7)                     (8)
Index 1    0.6474    0.6585        0.5347        0.4796     0.5998   0.6356                    0.6103                  0.0179*
Index 2    0.6507    0.6587        0.5326        0.3081     0.6158   0.6267                    0.6038                  0.003**
Index 3    0.3189    0.3128        0.1617        0.000***   0.2398   0.3203                    0.2388                  0.000***

Notes: This table reports the mean of actionability indices across test groups and environment groups, as well as results of a variance analysis across test groups and environments. p-values from the variance tests are noted in columns (4) and (8) respectively. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table D.1 Variation of actionability indices across Test Groups and Environments

Figure D.1 Variation of actionability indices with education level

Figure D.2 Variation of actionability indices with interpret score

           Gender                         Typology
           Male     Female   p-value     T1       T2       T3       T4       T5       p-value
           (1)      (2)      (3)         (4)      (5)      (6)      (7)      (8)      (9)
Index 1    0.6365   0.4958   0.3151      0.7223   0.6938   0.6030   0.5353   0.4587   0.0005**
Index 2    0.6333   0.5169   0.4572      0.6996   0.6641   0.6645   0.5363   0.4587   0.0381"
Index 3    0.2875   0.1492   0.1975      0.3719   0.2771   0.2826   0.1705   0.1251   0.0016**

Notes: This table reports the mean of actionability indices across gender and typology, as well as results of a variance analysis across gender and typology. p-values from the variance tests are noted in columns (3) and (9) respectively. Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘"’ 0.1 ‘ ’

Table D.2 Variation of actionability indices across gender and typology

Appendix E: Sample recommendations generated by recommendation engine

Figure E.1 Sample recommendation 1

Figure E.2 Sample recommendation 2

Figure E.3 Sample recommendation 3
