A Practical Guide to Measuring Women’s and Girls’ Empowerment in Impact Evaluations

Rachel Glennerster*, Claire Walsh, Lucia Diaz-Martin

* The views expressed here are those of the authors and do not necessarily reflect those of the funders or the Department for International Development.

overview

Impact evaluations can produce useful insights on how to design programs and policies that can increase women’s and girls’ empowerment and help us better understand the process of empowerment itself. Yet, it can be challenging to design a measurement strategy and to identify indicators that capture changes in empowerment, that are tailored to the local context, and that minimize reporting bias. Pulling insights from diverse disciplines and the experience of J-PAL affiliated researchers around the world, this guide offers practical tips for overcoming these challenges in impact evaluations. We emphasize the importance of conducting in-depth formative research to understand gender dynamics in the specific context before starting an evaluation, developing locally tailored indicators to complement internationally standardized ones, and reducing the potential for reporting bias in our instruments and data collection plan. We do not provide a single set of ready-to-go survey instruments; instead, we outline a process for developing indicators appropriate to each study along with extensive examples. In this way, we hope this guide can help provide researchers and practitioners the tools to select or develop their own indicators of empowerment that are right for their impact evaluations.

acknowledgements

We thank Christopher Boyer, Mayra Buvinic, Jan Cooper, Aletheia Donald, Sarah Gammage, Markus Goldstein, Seema Jayachandran, Hazel Malapit, Danielle Moore, Agnes Quisumbing, and Innovations for Poverty Action (IPA) who provided invaluable comments. All errors are our own.

table of contents

An introduction to this guide 1

When should we consider gender in measurement? 3

What is empowerment and when should we consider measuring it in an impact evaluation? 4

What are some challenges in measuring women’s and girls’ empowerment? 7

How can we build a reliable strategy for measuring empowerment in an impact evaluation? 11

Step 1. Formative research 12

Step 2. Theory of change, outcomes, and indicators 18

Step 3. Data collection instruments 30

Step 4. Data collection plan 41

Conclusion 45

povertyactionlab.org i

an introduction to this guide

A growing number of policymakers are investing in women’s and girls’ empowerment: “the process by which those who have been denied the ability to make strategic life choices acquire such an ability.”1 Policymakers have pursued empowerment as a policy goal in its own right and as a means to unlock greater development potential in low- and middle-income countries, and the number of programs and policies aimed at increasing empowerment has rapidly increased. As a result, many practitioners and researchers are grappling with how to more precisely measure something as complex as women’s and girls’ empowerment in impact evaluations.

Listening to women to learn about their aspirations and the barriers they face is a critical starting point. Good impact evaluations on empowerment start with rich discussions with women, girls, and other community members about what it means to be empowered in that specific context. As part of the formative research that my research team and I (Rachel) conducted for a randomized evaluation on girls’ empowerment and child marriage in Bangladesh, we interviewed young women about their daily lives, aspirations, and what they wanted to do that they were currently prohibited from doing.2 We also interviewed local NGOs and community partners, asking: “What does an empowered girl do differently than a girl who is not empowered?” Some of the answers we heard again and again were, “she can go where she likes,” “she can negotiate with her parents,” “she is not the last to eat at home,” and “she thinks differently about what girls can do.”

Distilling these rich conversations into a set of indicators that could be measured in a survey questionnaire was not simple or straightforward. Many people we spoke to said that empowered girls had the ability to travel in their communities. Yet, when we talked to girls, we quickly learned that a general question like “how far away from your home can you travel alone?” would not accurately capture how their mobility was constrained. Many girls described being unable to travel outside the home by themselves, yet they also mentioned regularly walking to school alone or with friends. When we asked them about this discrepancy, the girls responded, “but that’s different, that’s school!”

Through these interviews, we soon understood that a girl’s ability to travel alone depended on what she was doing and for whom. She could travel to and from school alone every day, but she could not attend local fairs on her own. To measure mobility as an indicator of empowerment, we cared most about a young woman’s ability to go somewhere for an activity that only had value to her. In our survey, then, we asked whether young women could travel alone to a list of common locations and activities, including some that were for no one’s benefit but the girl herself. Yet, by using a question and indicator that was so locally tailored, we lost the ability to compare our findings with data on women’s mobility in other contexts.

Researchers face many such trade-offs and challenges in measuring empowerment. Which outcomes should we prioritize from a longer list of plausible outcomes? How should we measure women’s and girls’ ability to make choices when we rarely observe the decision-making process? Should we ask people about past decision-making, try to observe a real choice, and/or assume some outcomes are good proxies for people’s ability to choose? There is already a wealth of scholarship by economists, feminist scholars, psychologists, anthropologists, sociologists, and practitioners addressing these conceptual challenges. In this guide, we will highlight practical insights from these disciplines that economists and others can use when measuring empowerment in quantitative impact evaluations.

1 Kabeer, Naila. 1999. “Resources, Agency, Achievements: Reflections on the Measurement of Women’s Empowerment.” Development and Change 30(3): 435-464. Abstract. https://doi.org/10.1111/1467-7660.00125.

2 Buchmann, Nina, Erica Field, Rachel Glennerster, Shahana Nazneen, Svetlana Pimkina, and Iman Sen. “Power vs Money: Alternative Approaches to Reducing Child Marriage in Bangladesh, a Randomized Control Trial.” Working Paper, May 2017. The study was funded by the International Initiative for Impact Evaluation, the US National Institutes of Health, the Nike Foundation, and the International Development Research Center.

We will also dive deeper into the many practical challenges of measuring women’s and girls’ empowerment and how to overcome them. We explore how to turn indicators into good survey questions; how to test whether survey respondents can understand our questions and answer them accurately; how to decide whom to interview and when to get the best information; how to measure empowerment in ways that are less subject to reporting bias; and how and when non-survey instruments can generate more reliable outcome measures than surveys. We also indicate how different instruments or approaches are likely to affect research costs.

who is this guide for?

This guide is designed to support the work of monitoring and evaluation practitioners, researchers, and students who are interested in learning how to measure women’s and girls’ empowerment in an impact evaluation. We primarily discuss strategies for developing good quantitative data collection instruments, including surveys and non-survey instruments, so this guide may be most relevant for practitioners and researchers interested in using these tools. Most of the examples are from randomized evaluations by J-PAL affiliated researchers and their co-authors that took place in low- or middle-income countries, but the underlying intuition may be relevant in high-income countries too. These insights may also be useful for program staff designing interventions focused on increasing women’s and girls’ empowerment, particularly when it comes to integrating monitoring and evaluation processes into program design.

what does this guide cover?

We will share insights on the following key questions and topics:

1. When should we consider gender in measurement?
2. What is empowerment and when should we consider measuring it in an impact evaluation?
3. What are some challenges in measuring women’s and girls’ empowerment and how can we overcome them?
4. How can we build a reliable strategy for measuring empowerment in an impact evaluation?
   Step 1. Formative research: conduct formative research to understand gender and empowerment in the specific context
   Step 2. Theory of change, outcomes, and indicators: map a theory of change to select appropriate outcomes and indicators
   Step 3. Data collection instruments: develop and validate data collection instruments that minimize reporting bias
   Step 4. Data collection plan: design a data collection plan that minimizes measurement error

We also include two in-depth appendices. Appendix 1 provides examples of survey questions related to women’s and girls’ empowerment from J-PAL affiliated researchers’ randomized evaluations. Appendix 2 features examples of a range of different types of non-survey instruments that can be used in quantitative analysis and tips for deciding when and how to use them.


1. when should we consider gender in measurement?

We should always consider gender in measurement in an impact evaluation, even when a program is not targeting one gender. For instance, who we choose to interview matters. A survey of heads of households, politicians, or business owners may miss women’s views if they are underrepresented in these positions. Even within a household, different members may be more or less informed about various aspects of family and economic life. It is important, then, to identify who has the information we want to collect before we begin our survey. We also need to consider how the enumerator’s gender, ethnicity, and class may influence respondents’ answers, as these factors could affect who consents to participate, and what individuals say.

Moreover, it is always important to consider gender in our analysis. Analyzing the overall impact of a program may mask important gender dynamics that disaggregating our analysis by gender could uncover. For example, a program that provides fertilizer to farmers may also increase weeding responsibilities, a chore often done by women. A program could both improve educational attainment for children overall and decrease the gender gap in school participation. It is critical to consider gender in the measurement and analysis of any impact evaluation to uncover whether and how the program affects people differently by gender. When planning our impact evaluation, we should consider getting a large enough sample size to be able to detect any important gender differences. We should calculate and report impact estimates by gender in our analyses, along with whether the differences are statistically significant. We should also include a discussion of the main reasons why the program did or did not have differential effects by gender.
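As a minimal sketch of what reporting impact estimates by gender can look like, the Python below computes a simple difference in means between treatment and comparison groups for women and for men, and then tests whether the two estimates differ. The data layout, function names, and outcome values are hypothetical and chosen only for illustration. The normal approximation and unclustered standard errors are simplifying assumptions; a real evaluation would more likely use a regression with a treatment-by-gender interaction and standard errors that match the randomization design.

```python
import math
from statistics import mean, variance


def effect_with_se(treated, control):
    """Difference in means and its standard error (unequal variances)."""
    diff = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    return diff, se


def two_sided_p(z):
    """Two-sided p-value from a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


def effects_by_gender(data):
    """Estimate the program effect separately for each gender.

    `data` maps gender -> {"treated": [...], "control": [...]};
    this layout is a hypothetical one chosen for illustration.
    """
    results = {}
    for gender, arms in data.items():
        diff, se = effect_with_se(arms["treated"], arms["control"])
        results[gender] = {"effect": diff, "se": se,
                           "p": two_sided_p(diff / se)}
    return results


def difference_in_effects(results, a="women", b="men"):
    """Test whether the effect for group `a` differs from that for group `b`.

    Assumes the two subgroup estimates are independent.
    """
    gap = results[a]["effect"] - results[b]["effect"]
    se = math.sqrt(results[a]["se"] ** 2 + results[b]["se"] ** 2)
    return {"gap": gap, "se": se, "p": two_sided_p(gap / se)}


# Made-up outcome values (for example, years of schooling), not study data.
example = {
    "women": {"treated": [5, 6, 7, 8], "control": [3, 4, 5, 6]},
    "men": {"treated": [4, 5, 6, 7], "control": [4, 5, 6, 7]},
}
by_gender = effects_by_gender(example)
gap = difference_in_effects(by_gender)
for gender, r in by_gender.items():
    print(f"{gender}: effect={r['effect']:.2f}, se={r['se']:.2f}, p={r['p']:.3f}")
print(f"women - men: gap={gap['gap']:.2f}, se={gap['se']:.2f}, p={gap['p']:.3f}")
```

Reporting the gap and its p-value alongside the two subgroup estimates matters because two estimates can differ in size, or in whether each is individually significant, without the difference between them being statistically significant.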


2. what is empowerment and when should we consider measuring it in an impact evaluation?

what is empowerment?

There are many different definitions of empowerment, but most of the seminal definitions emphasize agency and gaining the ability to make meaningful choices.3 Many definitions draw on Amartya Sen’s concept of an agent as “someone who acts and brings about change, and whose achievements can be judged in terms of her own values and objectives.”4 In this guide, we primarily use Naila Kabeer’s definition of empowerment as “the process by which those who have been denied the ability to make strategic life choices acquire such an ability.”5 Kabeer’s seminal “resources, agency, and achievements” framework also provides a practical intuition for measuring empowerment, which involves three interrelated dimensions (see Figure 1):6

Resources: gaining access to material, human, and social resources that enhance people’s ability to exercise choice, including knowledge, attitudes, and preferences

Agency: increasing participation, voice, negotiation, and influence in decision-making about strategic life choices

Achievements: the meaningful improvements in well-being and life outcomes that result from increasing agency, including health, education, earning opportunities, rights, and political participation, among others7

Measuring indicators related to resources, agency, and achievements over the course of an evaluation can be an intuitive and practical way to measure the process of empowerment.

Like all power relations, the process of empowerment is also shaped by and interacts with the norms and institutions (cultural, social, political, and economic) that define an individual’s possibilities in a given context. These institutionalized “structures of constraint” shape the choices available to women and girls at every step of the empowerment process.8

For example, norms about women’s mobility in a place like Bangladesh can shape women’s resources like social capital. These norms can also affect women’s agency in terms of what decisions they are actually able to make, to what extent they need to ask for permission, and what kinds of decisions are actually empowering. For example, making decisions about household purchases may be empowering in some contexts where women are denied this kind of choice. However, it may be disempowering to other women who feel that taking charge of these decisions is an additional burden rather than a choice. Norms can also affect women’s achievements. For instance, social norms about women and work shape whether women who want to start businesses or work outside the home are able to do so. In our evaluations, we should consider how these structures of constraint might limit the success of empowerment interventions by impeding people’s ability to translate resources into agency, and agency into achievements. They may also influence the extent to which gains for women in the household or private sphere translate to women’s collective improvements in the public sphere and vice versa.9

3 Malhotra, Anju, Sidney R. Schuler, and Carol Boender. 2002. “Measuring Women’s Empowerment as a Variable in International Development.” Background paper prepared for the Workshop on Poverty and Gender: New Perspectives, June 28, 2002, 6. http://siteresources.worldbank.org/INTGENDER/Resources/MalhotraSchulerBoender.pdf.

4 Sen, Amartya. 1999. Development as Freedom. New York: Alfred A. Knopf, 19.

5 Kabeer 1999, Abstract.

6 Kabeer 1999, 437-438.

7 Many outcomes, such as health and educational attainment, can be considered both resources and achievements.

8 Kabeer, Naila. “Paid Work, Women’s Empowerment and Gender Justice: Critical Pathways of Social Change.” Working Paper, Institute of Development Studies, January 2008, 24. https://www.ids.ac.uk/publication/paid-work-women-s-empowerment-and-gender-justice-critical-pathways-of-social-change.

9 Kabeer 2008.


figure 1. visual representation of kabeer’s resources, agency, and achievements framework of empowerment

Meaningful Choice

Resources* (preconditions)
Examples:
• Human capital
• Financial capital
• Social capital
• Physical capital & assets

Agency (process)
Examples:
• Voice
• Participation
• Decision-making

Achievements (outcomes)
Examples:
• Education
• Health & nutrition
• Income generation

Structures of Constraint: Norms and institutions that vary by context and influence every step of the process

*Resources may also be achievements.

Citation: Kabeer, Naila. 1999. “Resources, Agency, Achievements: Reflections on the Measurement of Women’s Empowerment.” Development and Change, 30 (3): 435-464, Abstract. https://doi.org/10.1111/1467-7660.00125.

Kabeer, Naila. “Paid Work, Women’s Empowerment and Gender Justice: Critical Pathways of Social Change.” Working Paper, Institute of Development Studies, January 2008, 24. https://www.ids.ac.uk/publication/paid-work-women-s-empowerment-and-gender-justice-critical-pathways-of-social-change.


when should we consider measuring empowerment in an impact evaluation?

We should measure empowerment in an impact evaluation when it is the primary or secondary objective of the program. Many programs specifically target women’s and girls’ empowerment as the main outcome, like negotiation skills programs for young women or trainings on women’s rights. Besides these clear cases, there are many programs for which empowerment is a hoped-for but secondary outcome, which is also key to measure. For instance, many financial inclusion programs that increase access to credit, savings, and payments services aim to improve the performance of small enterprises, but also aim to increase women’s economic empowerment. It may also be important to consider measuring empowerment when there is a risk that a program may disempower or have negative consequences for women. We may want to measure, for example, whether a program unintentionally adds to women’s or girls’ domestic duties, care work, or chores.10

We may also expect program impacts to vary among women according to how empowered they are. For example, access to free contraception may only increase contraception use when women have enough agency to negotiate the ability to use it with their partners. In these cases, it can be useful to measure an indicator of agency or decision-making power before implementing the program in order to capture how impacts vary according to initial differences among women. Since empowerment programs span many different domains, from finance to education, health, and political participation, the barriers and opportunities women face in becoming more empowered in these various aspects will necessarily be different. It is therefore important that our outcomes, indicators, and data collection tools are tailored to the specific domain(s) under study.11

While it is important to use indicators specific to the interventions and domains being studied, empowerment measures tend to include a core set of concepts. Topics often include women’s access to and control over resources including income and assets; participation in important decisions at the personal, household, and community level; control over reproductive health and fertility choices; subjective well-being and happiness; mobility; time use and sharing domestic work; freedom from violence; community and political participation; and well-being outcomes in domains like education, health, and labor.12

10 We may find it useful in many cases to consider how social norms related to gender roles in unpaid domestic work interact with the program being evaluated and how it affects women’s time use. The World Bank publication below on time use describes two measurement approaches: stylized questions and time diaries. There are many other resources available on conceptualizing and measuring time use, time poverty, and unpaid domestic work, including the Levy Institute’s resources and Marilyn Waring’s seminal book If Women Counted below.

Seymour, Greg, Hazel Jean Malapit and Agnes R. Quisumbing. “Measuring Time Use in Development Settings (English).” World Bank Policy Research Working Paper No. WPS 8147, July 2017. http://documents.worldbank.org/curated/en/443201500384614625/Measuring-time-use-in-development-settings;

Levy Economics Institute of Bard College. n.d. “Publications on Time Poverty.” Accessed March 19, 2018. http://www.levyinstitute.org/topics/time-poverty;

Waring, Marilyn. 1998. If Women Counted: A New Feminist Economics. San Francisco: Harper & Row.

11 Many researchers emphasize the importance of using domain-specific measures of empowerment. Some useful examples include International Center for Research on Women’s (ICRW) modules for measuring economic empowerment (Golla, Anne Marie, Anju Malhotra, Priya Nanda, Rekha Mehra, Aslihan Kes, Krista Jacobs, and Sophie Namy. 2011. “Understanding and Measuring Women’s Economic Empowerment.” International Center for Research on Women. https://www.icrw.org/publications/understanding-and-measuring-womens-economic-empowerment/) or International Food Policy Research Institute’s (IFPRI) index for measuring women’s empowerment in agriculture (Alkire, Sabina, Ruth Meinzen-Dick, Amber Peterman, Agnes Quisumbing, Greg Seymour, and Ana Vaz. 2013. “The Women’s Empowerment in Agriculture Index.” World Development 52: 71-91. https://doi.org/10.1016/j.worlddev.2013.06.007).

For more on the importance of domain-specific indicators see: Alkire, Sabina. 2005. “Subjective Quantitative Studies of Human Agency.” Social Indicators Research 74 (1): 217–260. 236. https://doi.org/10.1007/s11205-005-6525-0.

12 Categories synthesized based on a review of the following publications: Alkire et al. 2013; Golla et al. 2011;

Donald, Aletheia, Gayatri Koolwal, Jeannie Annan, Kathryn Falb, and Markus Goldstein. “Measuring Women’s Agency.” World Bank Policy Research Working Paper no. 8148, July 2017, 34-35. http://documents.worldbank.org/curated/en/333481500385677886/Measuring-womens-agency;

Laszlo, Sonia, and Kate Grantham. “Measurement of Women’s Economic Empowerment in GrOW Projects: Inventory and User Guide.” McGill University GrOW Working Paper, December 2017;

Lombardini, Simone, Kimberly Bowman, and Rosa Garwood. 2017. “A ‘How-To’ Guide To Measuring Women’s Empowerment: Sharing Experience from Oxfam’s Impact Evaluations.” Oxfam. https://policy-practice.oxfam.org.uk/publications/a-how-to-guide-to-measuring-womens-empowerment-sharing-experience-from-oxfams-i-620271.

3. what are some challenges in measuring women’s and girls’ empowerment?

1. measuring people’s ability to make important life choices is challenging because we rarely observe decision-making directly.

Empowerment is not just about changes in well-being; it is also about people’s agency in achieving these changes. Some of the most common options for measuring agency and decision-making power, including asking people about how past decisions were made in a survey, are prone to reporting bias if not carefully designed and implemented. Moreover, since we can often only observe the outcomes of choices and not the real decision-making process itself, it is hard to know whether changes in well-being are indeed the result of women’s increased ability to make choices or not. For example, in Rachel and co-authors’ evaluation in rural Bangladesh, we found that more married women took on income-generating activities as a result of an empowerment program, which initially appeared to be an empowering labor market choice. Yet, in qualitative interviews with a subset of these women, we found that some were working out of severe economic necessity and many had limited autonomy in choosing their income-generating activity. Women were also often limited to working within the home.13

2. empowerment is a process.

Unlike many other outcomes, empowerment is a continuous process so it requires more effort and creativity on our part to measure it well. Many theories of empowerment emphasize the importance of conceptualizing it as a process, including Kabeer’s “resources, agency, achievements” framework. She argues that it is important to measure the resources that could enhance women’s ability to make choices, women’s agency (voice, participation, etc.) in making those choices, and the final changes in well-being that could result from increased agency.14 To measure this multistep process, we need to select and measure short-term, intermediate, and final outcomes that can credibly make this causal link. For instance, say we want to measure whether a negotiation program empowers girls to complete secondary school. We will need to measure their resources (such as their negotiation ability), the decision-making process between parents and girls that determines whether they stay in school, and their ultimate educational achievement.

3. many aspects of empowerment are susceptible to reporting bias.

Asking people about sensitive topics like gender attitudes, aspirations, reproductive health, contraception use, marriage, violence, and decision-making can lead to reporting bias if our survey instruments are not well designed. Social desirability bias is one type of reporting bias that is particularly challenging to mitigate when measuring empowerment. It occurs when respondents give answers that they think the surveyor wants to hear or that are in line with generally accepted social norms rather than reality. Respondents’ stated preferences may also be different from their revealed preferences, meaning that people can say one thing but do another. For example, parents may say that they believe girls should be given the same education opportunities as boys in a survey, but they may not actually enroll all of their daughters in school. It takes a good deal of time and effort to design data collection instruments that mitigate social desirability bias and other types of reporting bias.

13 Field, Erica, Rachel Glennerster, and Shahana Nazneen. 2018. “Economic Empowerment of Young Women in Bangladesh: Barriers and Strategies.” Women at Work: Addressing the Gaps. International Policy Centre for Inclusive Growth Policy in Focus 15 (1): 31-32. http://www.ipc-undp.org/publication/28507.

14 Kabeer 1999, 437-438.


4. empowerment means different things in different contexts, but we may also want to compare across contexts.

While the underlying concept of empowerment (increasing one’s ability to make important life choices) is not specific to a particular context, its concrete manifestations differ by location. For example, a woman’s ability to sell goods in a local market may be a relevant constraint in parts of rural Bangladesh, but not in a context where this activity is common for women, like urban Mexico. Since the barriers women and girls face vary considerably by context, so must the way that we measure empowerment, particularly in impact evaluations. If we do not include locally tailored indicators, we run the risk of failing to capture the changes that occur as a result of the intervention. Yet if we only use locally tailored indicators, we may lose the ability to compare our findings with other studies on empowerment from different contexts or contribute our data to a meta-analysis, which are both useful for drawing broader lessons about effective empowerment programs.

5. prioritizing outcome measures is difficult.

Empowerment spans nearly every realm of a woman’s life, making it challenging to decide what to focus on, particularly because changes in empowerment can be unpredictable. However, selecting too many outcome measures may lead us to collect data we do not end up using in our analysis, which also increases the cost of research and unnecessarily uses more of the study participants’ time. Furthermore, with a large number of outcomes, there is a likelihood that at least one indicator will show a significant change just by chance rather than being a true impact of the program. We need to prioritize the most relevant outcomes and to be realistic about what a given program can change.

6. measuring women’s preferences is challenging in contexts where women have internalized society’s views.

Power often operates invisibly through our everyday social institutions, norms, and habits. Women may internalize society’s views about their status and have preferences that reflect and accept inequality between men and women. For example, fourteen percent of women responding to a global ILO/Gallup poll in over a hundred countries responded that it is not acceptable for a woman in their families to have a paid job outside the home if she wants one.15 Researchers need to be careful not to impose an outsider view of what women should want, and yet we also need to keep in mind that women’s preferences may reflect society’s views about gender rather than their own true preferences. Even though women’s preferences are an important component of empowerment, measuring preferences alone may not always fully reflect women’s ability to make a meaningful choice.

7. disempowerment can heighten data collection challenges.

Related to the challenge above, when women lack power and voice, it may be more difficult to collect data about their aspirations, opinions, and desires. In some cases, for example, women may find it hard to answer questions about goal-setting or their plans for the future. It can also sometimes be challenging to find and hire women enumerators to interview women. When working with particularly vulnerable populations, like women who may have experienced abuse or violence, we must be even more thoughtful about the ethical implications of our research and ensure that the consent process takes unequal power dynamics into account (see page 11 for more information).

There are many approaches to overcoming these seven challenges. Table 1 on the next page summarizes some key tips for overcoming them, along with references to sections in this guide with more information.
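To illustrate the point in challenge 5 that a long list of outcomes raises the chance of a spurious finding, the sketch below computes the probability that at least one of several outcomes appears significant purely by chance when the program has no true effect on any of them. It assumes the tests are independent and all use the same significance level, which real outcome measures rarely satisfy exactly, and the function names are hypothetical. The per-test threshold shown is the Bonferroni adjustment, one common and conservative correction, included as an example rather than as this guide’s recommendation.

```python
def prob_any_false_positive(n_outcomes, alpha=0.05):
    """P(at least one false positive) across independent tests with no true effects."""
    return 1 - (1 - alpha) ** n_outcomes


def bonferroni_threshold(n_outcomes, alpha=0.05):
    """Per-test significance level that keeps the family-wise error rate at most alpha."""
    return alpha / n_outcomes


for k in (1, 5, 10, 20):
    print(f"{k:>2} outcomes: P(any false positive) = {prob_any_false_positive(k):.2f}, "
          f"Bonferroni threshold = {bonferroni_threshold(k):.4f}")
```

Under these assumptions, testing 20 outcomes at the 5 percent level gives roughly a 64 percent chance of at least one false positive, which is one reason to commit to a short list of primary outcomes before data collection.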

15 Gallup, International Labor Organization. 2017. “Towards a Better Future for Women and Work: Voices of Women and Men.” ILO-Gallup Report, March 8, 2017. 74. http://www.ilo.org/global/publications/books/WCMS_546256/lang--en/index.htm.


table 1. seven challenges in measuring empowerment and approaches to overcoming them

This table lists approaches to overcoming seven common challenges in measuring women’s empowerment, along with the page numbers where you can find more in-depth information. Challenges one through five are addressed on pages 21-29, while challenges six and seven are addressed in various sections which are noted in the table.

Each entry below gives the challenge, the approaches for overcoming it, and the page number(s) in parentheses.

1. Measuring people’s ability to make important life choices is difficult because we rarely observe decision-making directly.
Combine one or more of the following approaches (pages 21-25):
• Ask people about specific decision-making processes.
• For joint decisions, ask more than one person about the decision-making process.
• Measure women’s and men’s preferences at baseline and track whether outcomes move in the direction of women’s preferences after the program.
• Measure the psychological components of agency: women’s ability to set goals in line with what they value and act on them.
• Try to observe a choice directly or create a situation in which you can observe a choice.
• Measure fundamental outcomes related to well-being.

2. Empowerment is a process.
• Track each major step along the causal chain: resources, agency, and achievements. (pages 25-26)
• Use panel data to observe changes in the same people over time.

3. Many aspects of empowerment are susceptible to reporting bias.
• When possible, complement indicators that are subject to reporting bias with more objective indicators or proxy indicators. (pages 27-28)
• Triangulate an outcome using multiple indicators or perspectives when we don’t have the ideal measure.
• Frame the question indirectly by asking about a hypothetical situation.
• Consider using a non-survey instrument that is less susceptible to reporting bias. (pages 34-36, Appendix 2)

4. Empowerment means different things in different contexts, but we may also want to compare across contexts.
• Use findings from formative research to select or develop locally tailored indicators and questions. (pages 12-17)
• Pilot extensively in the field before launching a survey. (pages 37-40)
• Complement context-specific indicators of empowerment with more standard ones. (page 28)

5. Prioritizing outcome measures is difficult.
• Prioritize indicators that are related to the main objectives of the program (or can be expected as an important spillover), de-prioritize indicators that are not. (page 29)

6. Measuring women’s preferences is challenging in contexts where women have internalized society’s views.
• Measure inequalities along with the extent to which these inequalities are accepted. (page 25)
• Measure changes in important life outcomes like health, education, or economic status that are widely accepted as signals of improved well-being. (pages 21, 25)

7. Disempowerment can heighten data collection challenges.
• In addition to university IRB approval, seek feedback and approval from local IRB institutions, local leaders, and respondents’ family members if needed. (pages 11, 39, 42)
• Consider enumerator identity carefully, create private spaces for interviews, and/or use technologies that allow respondents to answer privately. (pages 41-43)

4. how can we build a reliable strategy for measuring empowerment in an impact evaluation?

In the rest of the guide, we will walk through lessons we have learned about how to measure women’s and girls’ empowerment during the different stages of an impact evaluation. Box 1 lists the main steps that we cover for developing a good measurement strategy. While the steps are listed in approximate chronological order, in reality we often work on several simultaneously. Before an evaluation begins, it is important to conduct formative research to understand gender and empowerment issues in a particular context prior to designing our evaluation and data collection instruments. Using our findings from this pilot research, we can map a theory of change for the intervention being evaluated, which will help us prioritize outcomes and indicators. Once we have decided what to measure, we need to develop and pilot our data collection instruments in communities similar to ones where the evaluation will take place. This is an important reality check to make sure our surveys and non-survey instruments work in local conditions and pick up the information we are interested in. Based on our piloting, we can design a data collection plan to minimize measurement error, including deciding where, when, and how often to gather data, and ensuring people are comfortable speaking with our enumerators.

box 1. developing a measurement strategy

Step 1. Conduct formative research to understand gender and empowerment in the specific context

Step 2. Map a theory of change and use it to select appropriate outcomes and indicators

Step 3. Develop and validate data collection instruments that minimize reporting bias

Step 4. Design a data collection plan that minimizes measurement error

ethical considerations

Before initiating any fieldwork, we must ensure that our data collection plans meet international standards and norms intended to protect research participants. In the academic context, our research design, data collection instruments, and consent forms must be reviewed and approved by the Institutional Review Boards (IRBs) at our host academic institutions. In many cases, they must also be reviewed and approved by local IRBs or government research councils in the countries where we are conducting research. Many research institutions outside academia have their own IRBs. As IRB requirements can vary, we should check with our host institution to see what is required before starting any interaction with human subjects, even before the formative research phase.

If we are studying girls’ empowerment, we need to incorporate additional protections for working with children, including getting approval from parents or guardians. The risks of participating in research may be elevated for women and girls when they are in marginalized positions, which requires careful consideration when developing research approaches. For example, if we are asking women about their experiences of violence, we should research laws for mandatory reporting of violence, phrase survey questions to minimize distress, consider how to protect survey staff if needed, and get guidance on if and under what conditions referrals for care and support should be given. Note that social science and medical ethics as well as national guidelines can differ on some of these questions (for example, referral to care) and enumerators in many cases may not be qualified to make medical judgments. For more in-depth guidance on measuring domestic violence and intimate partner violence, see Annex 1 of the WHO Multi-country Study on Women’s Health and Domestic Violence against Women and Innovations for Poverty Action’s resource on conducting violence research.16 For more information and resources on ethics in impact evaluations in general, see J-PAL’s ethics page.17

16 García-Moreno, Claudia, Henrica A.F.M. Jansen, Mary Ellsberg, Lori Heise, and Charlotte Watts. 2005. “WHO Multi-Country Study on Women’s Health and Domestic Violence against Women: Summary Report of Initial Results on Prevalence, Health Outcomes and Women’s Responses.” Geneva: World Health Organization, 101.

Innovations for Poverty Action. 2018. “The Safe and Ethical Conduct of Violence Research: Guidance for IPA Staff and Researchers.” Updated June 2018. https://www.poverty-action.org/publication/ipv-ethical-guidance.

17 Abdul Latif Jameel Poverty Action Lab (J-PAL). N.d. “Ethics.” Research Resources. Accessed March 19, 2018. https://www.povertyactionlab.org/research-resources/ethics.

step 1. formative research

Conduct formative research to understand gender and empowerment in the specific context

We conduct formative research prior to starting an impact evaluation to better understand the local context, the underlying problem that the program is trying to address, and the rationale behind the proposed solutions. As part of this process, we should work closely with our implementing partners to either carefully design and pilot the intervention, or to ensure that the program is being implemented as designed. It also helps us better formulate the research questions for the evaluation. Formative research is a combination of field and desk research, both qualitative and quantitative, that helps us gain a deeper understanding of empowerment and barriers to it in a particular context.18

Good formative research creates repeated opportunities to listen to people living in the communities in which we are working. Some common formative research exercises include needs assessments or stakeholder analyses; common research methods include semi-structured interviews, focus groups, and direct observation.19

Whatever the methods we choose, it is critical to devote a significant amount of time to better understanding local gender dynamics and the barriers to empowerment. This should include analyzing resources, agency, and achievements that women want, currently have, and do not have. The findings from our formative research will shape nearly every aspect of the evaluation design, from our theory of change to indicator selection, what questions we ask, and whom to survey. Because it is so integral to every step of the impact evaluation, it is not advisable to simply subcontract this work out to others. Collaborating with researchers who specialize in qualitative methods can also help improve the quality and rigor of our formative research.

18 Glennerster, Rachel, and Kudzai Takavarasha. 2013. Running Randomized Evaluations: A Practical Guide. Princeton: Princeton University Press, 66-81.

19 Many economists that work on impact evaluations conduct this kind of formative research, some more and less formally, and they may use different names to describe these activities. In addition, some of this background research is conducted prior to designing the intervention itself and can help shape the design of the program, as well as the research.

what are some common formative research exercises?

Needs assessments and stakeholder analyses are formal tools we can use to systematically gather and analyze information on gender dynamics and barriers to empowerment in a specific context. It is not necessary to structure our formative research process in this manner, but it can be helpful if we are working in a new context and need to gain an in-depth understanding of these issues.20

A needs assessment21 documents the gap between the current and desired state of the world for a particular issue by gathering information from many different stakeholders through interviews, focus groups, direct observation, and in some cases existing or new surveys. It helps us home in on the key needs or barriers people face in solving their problems and how the program that implementers have in mind could be designed to address them. It can also be particularly useful for understanding the nature, consequences, and potential drivers of problems in order to link them to potential solutions. The World Bank’s free book A Guide to Assessing Needs is a helpful resource for conducting a needs assessment.22

A stakeholder analysis uses data from interviews, focus groups, and direct observation to document the relative power and interest of the various actors involved in a particular issue. It is another tool that can help us understand and map gender dynamics relevant to our evaluation, as well as important actors to include in our evaluation. For example, a stakeholder analysis for a program that seeks to increase women’s participation in local politics might help us identify the possibility of opposition from certain groups of community members that could undermine the success of the program. We might consider adding additional survey modules measuring these community members’ reactions as a result. Kammi Schmeer’s “Stakeholder Analysis Guidelines” and Bryson and Quinn Patton’s “Analyzing and Engaging Stakeholders” are useful resources for conducting a stakeholder analysis.23

For an impact evaluation on empowerment, we can use these frameworks to assess what resources are available or denied to women or girls, where they lack the ability to make life choices that are important to them, what choices they would make if they could, and what changes are possible. It is also important to assess how gender norms and constraints may interact with the program being tested.

how do we collect data in formative research?

We should use a combination of field and desk research during this stage. We can start by reading the existing literature on gender in our context and consulting other implementers and researchers with expertise in the study context. In addition to reading work from anthropology, history, and/or sociology, it is also useful to read descriptive quantitative studies or other impact evaluations from the same context.

Listening to the perspectives of people in communities similar to those where we will conduct the evaluation is the most important element. We typically use direct observation, semi-structured interviews, and/or focus groups to collect these kinds of data. Some researchers also use immersion exercises and/or participatory research methods. Collaborating with trained qualitative researchers as co-investigators can be valuable for applying the necessary rigor in the data collection and analysis process. However, we should not subcontract this work out to others, as there is simply no substitute for spending time in the field ourselves.

We only give a brief summary of common qualitative methods here, as there are many existing in-depth resources by researchers and practitioners with expertise in qualitative methods, including Michael Quinn Patton’s Qualitative Research & Evaluation Methods and FHI360’s free field guide on qualitative methods.24

Direct observation is critical and enables us to get a more in-depth picture of issues and opportunities related to women’s and girls’ empowerment in a particular context. For example, if there are groups of women that regularly meet in the community, a representative from the research team may be interested in observing several of these meetings to get detailed information about the issues important to them.

Semi-structured interviews allow for open-ended discussion, enabling new ideas or issues to emerge that we may not have previously thought of. They are one-on-one interviews that follow a general script of open-ended questions but allow flexibility to ask impromptu follow-up questions. We should strategically interview people from different parts of the community so that we receive diverse perspectives. In addition to helping identify the barriers women face, we can also use semi-structured interviews to generate a list of qualities that people in the community perceive as signs that a woman is empowered. The most frequent answers to these questions can help inform the indicators we select to measure resources, agency, and achievements. There is no magic number for how many interviews to conduct; the goal is to repeat them until they no longer yield new meaningful information. Oxfam’s “Conducting Semi-Structured Interviews” provides some useful guidance on carrying out semi-structured interviews.25

20 Needs assessments and stakeholder analyses typically occur as inputs for program or intervention design. Thus, if we are evaluating an existing program, we may be able to get this information from our implementing partners. However, if we are designing new treatment arms or cannot access these preexisting analyses, it may be helpful to structure our formative research in this way.

21 This can also be called a situation analysis, landscape analysis, or need statement, among other terms.

22 Watkins, Ryan, Maurya West Meiers, and Yusra Laila Visser. 2012. “A Guide to Assessing Needs: Essential Tools for Collecting Information, Making Decisions, and Achieving Development Results.” Washington, DC: World Bank. http://hdl.handle.net/10986/2231.

23 Bryson, John M., and Michael Quinn Patton. 2010. “Analyzing and Engaging Stakeholders.” Handbook of Practical Program Evaluation Fourth Edition: 36-61. https://experts.umn.edu/en/publications/analyzing-and-engaging-stakeholders.

Schmeer, Kammi. 2000. “Stakeholder Analysis Guidelines.” Policy Toolkit for Strengthening Health Sector Reform 2: 1-43. http://www.who.int/workforcealliance/knowledge/toolkit/33.pdf.

24 Mack, Natasha, Cynthia Woodsong, Kathleen M. MacQueen, Greg Guest, and Emily Namey. 2005. “Qualitative Research Methods: A Data Collector’s Field Guide.” Research Triangle Park: Family Health International. https://www.fhi360.org/sites/default/files/media/documents/Qualitative%20Research%20Methods%20-%20A%20Data%20Collector%27s%20Field%20Guide.pdf.

Patton, Michael Quinn. 2015. Qualitative Research & Evaluation Methods: Integrating Theory and Practice Fourth Edition. Thousand Oaks, California: SAGE Publications.

Hennink, Monique, Inge Hutter, and Ajay Bailey. 2011. Qualitative Research Methods. Thousand Oaks, California: SAGE Publications.

25 Raworth, Kate, Caroline Sweetman, Swati Narayan, Jo Rowlands, and Adrienne Hopkins. 2012. “Conducting Semi-Structured Interviews.” Oxfam. https://policy-practice.oxfam.org.uk/publications/conducting-semi-structured-interviews-252993.


Focus groups are semi-structured interviews with groups of people led by trained moderators. We can use them to understand how communities or societies think about a particular issue in general, like women’s access to education or health services, rather than getting an individual’s opinion. We should keep in mind that charismatic or powerful individuals can shape or drown out others’ opinions. For this reason, it is important to complement focus groups with individual interviews. We can use our discretion to decide how many focus groups to conduct to get the right amount of information. We may find that five are enough if we start to hear the same answers over and over again, or we may want to implement more if each one is still yielding new and interesting information. Just like interviews, focus groups should be conducted with different kinds of people in order to get different perspectives on the same issue. Participative Ranking Methodology (PRM) is one kind of focus group methodology that may be useful in developing locally grounded indicators of empowerment (see Box 2 for more details).

box 2. participative ranking methodology (prm)

PRM uses a ranking process that can be used to gather data on how members of a community understand a specific concept and how they view the relative importance of the categories used to explain it.26 Researchers formulate a key research question, such as “how can you tell if a girl under the age of eighteen is doing well in your community?” A trained moderator gathers a small focus group of roughly seven to fifteen participants and leads a discussion on the question until the group agrees on approximately five to eight categories and ranks them in order of importance. The focus groups are repeated with different groups of community members. Researchers can then analyze how often each category was chosen across groups and use the most frequently cited categories to develop locally tailored indicators.

26 Ager, Alastair, Lindsay Stark, and Alina Potts. 2010. “Participative Ranking Methodology: A Brief Guide: Version 1.1.” New York, NY: Program on Forced Migration & Health, Mailman School of Public Health, Columbia University. https://doi.org/10.13140/RG.2.2.34356.45448.
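The tallying step that Box 2 describes, counting how often each category was chosen across focus groups, can be sketched in a few lines. This is a minimal illustration under our own assumptions rather than the procedure from the PRM guide: the function name and the category labels are hypothetical, each group is assumed to list a category at most once, and the mean rank is an extra summary we add alongside the frequency count. The toy groups list only three categories each for brevity, whereas real PRM groups typically agree on five to eight.

```python
from collections import Counter, defaultdict


def summarize_prm(group_rankings):
    """Summarize ranked categories from several PRM focus groups.

    group_rankings: list of lists; each inner list holds one group's
    categories ordered from most to least important.
    Returns (category, groups_citing, mean_rank) tuples sorted by how many
    groups cited the category (most first), then by mean rank
    (1 = most important), then alphabetically.
    """
    counts = Counter()
    rank_sums = defaultdict(int)
    for ranking in group_rankings:
        for position, category in enumerate(ranking, start=1):
            counts[category] += 1
            rank_sums[category] += position
    summary = [(c, counts[c], rank_sums[c] / counts[c]) for c in counts]
    summary.sort(key=lambda row: (-row[1], row[2], row[0]))
    return summary


# Hypothetical category labels, made up purely for illustration.
example_groups = [
    ["attends school", "is healthy", "has friends"],
    ["is healthy", "attends school", "helps at home"],
    ["attends school", "has friends", "is healthy"],
]

if __name__ == "__main__":
    for category, n_groups, mean_rank in summarize_prm(example_groups):
        print(f"{category}: cited by {n_groups} groups, mean rank {mean_rank:.2f}")
```

Categories cited by most groups are natural candidates for locally tailored indicators; the mean rank can help break ties, but it should be read cautiously when a category was cited by only one or two groups.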


Immersion visits: Immersion visits are a more in-depth form of direct observation. In an immersion visit, researchers and/or practitioners travel to live in a community similar to where the evaluation will take place for a short period of time. See Box 3 for an example of an immersion visit from a program and evaluation in Burkina Faso, led by the implementing partner that designed the program, Development Media International (DMI).

In some cases, the formative research itself can change people’s opinions or views. For example, asking people to talk about topics they don’t normally discuss in focus groups or interviews might shift how they think about these topics. We conduct formative research in areas outside but similar to where the evaluation will take place to avoid introducing an unintended intervention in our evaluation sample.

box 3. case study: using immersion visits to inform questionnaire design

Rachel, Joanna Murray, and Victor Pouliquen are working with Development Media International (DMI) to conduct a randomized evaluation measuring the impact of DMI’s mass media campaigns on family planning and gender norms in Burkina Faso. DMI sent staff (who were mostly based in the capital Ouagadougou) to visit different rural parts of the country for week-long immersion visits to gain a better understanding about how women and men think and talk about family planning. In these visits, DMI discovered critical information that helped the research team and DMI develop locally tailored and culturally appropriate mass media campaigns, indicators, and survey questions:

“In a country where only 11% of women in rural areas use modern contraceptives, we assumed that awareness about contraception methods would be fairly low. Our research showed that almost everyone knew of at least one modern method—yet there were a lot of myths about the side effects. Rather than just encouraging people to use contraception methods and educating them about the different options available, we will need to address those myths in our campaign…We also found some major differences in behaviors and knowledge across our intervention areas. These are partially reflected in the local languages—in three of the local languages we studied, there were different words to describe family planning, birth spacing, and contraception. By contrast, in one of the language groups the same word ‘Maapedi/ Mapè’ was used for all of these concepts.”27

DMI’s in-depth work revealed many new insights that helped shape the survey questionnaires for the evaluation. First, it helped identify the right words to use for “family planning” in the different local language versions of the survey. Second, it helped us realize that we needed to collect data on the prevalence of certain myths about contraceptives in the evaluation because this could be driving low use. DMI’s immersion visits taught us which myths were prevalent in these villages. Based on this formative research, our baseline questionnaire asked respondents if they believed that birth control pills, implants, or injections could make a woman sterile or cause sickness to see if these beliefs were driving low use and if they changed over time.

27 Development Media International. 2016. “Sending Our Writer to Live in a Village for a Week.” DMI News, May 23, 2016. http://www.developmentmedia.net/news/sending-our-writers-to-live-in-a-village-for-a-week.


synthesis and analysis of qualitative data

Once all the information is gathered, researchers and practitioners can work together to code and analyze the data. We can then summarize the current state of women’s empowerment in a particular domain, the biggest barriers women face in becoming more empowered, and how the program being designed could potentially address them. It is also beneficial to present the findings to some of the original respondents to get their feedback on our conclusions and make sure they accurately reflect their opinions. We should also document and save the data and analysis from our formative research in a place and format that is easy to access and use throughout the evaluation.

quantitative analysis of existing or new survey data

In addition to these qualitative research exercises, it is useful to examine data from representative surveys. This can help validate our qualitative findings and assess if they apply to the broader population. For example, in the qualitative interviews for Rachel and co-authors’ evaluation on young women’s empowerment in Bangladesh, many community members and partners stated that menstruation was a major reason why girls miss school. Yet when we conducted our baseline survey in several hundred villages, we found that very few girls missed school due to their periods. Instead, the most common reason for missing school was that the teacher was not present during class time.

We don’t necessarily need to collect new data to do this type of analysis. We can often find the information we need, such as the prevalence of particular issues or problems, in publicly available datasets from nationally representative surveys. These include surveys from government statistics bureaus, or USAID’s Demographic and Health Surveys, which include modules on everything from women’s status and empowerment, to family planning and domestic violence, among others, from many low- and middle-income countries.28

main takeaways on formative research

• Our measurement strategy is only as good as our understanding of gender dynamics and empowerment in the local context. Spending significant time conducting formative research in the field will improve our theory of change, outcome indicators, and survey questions.

• Formative research is a combination of field and desk research that uses methods like semi-structured interviews, focus groups, and direct observation.

• In an evaluation on empowerment, our goal is to use these tools to understand what resources are available or denied to women or girls, where they lack the ability to make life choices that are important to them, what choices they would make if they could, and what changes are possible.

• Collecting qualitative information from many different sources, including women and girls, men and boys, NGOs, and other community stakeholders and comparing our qualitative findings to new or existing representative surveys can help us validate our findings.

• Formative research can also help us consider how influential people and organizations should be incorporated into our data collection strategy.

28 Demographic and Health Surveys Program. N.d. The DHS Program. Accessed March 19, 2018. https://www.dhsprogram.com.

Demographic and Health Surveys Program. N.d. “Women’s Status and Empowerment.” The DHS Program. Accessed March 19, 2018. https://dhsprogram.com/Topics/Womens-Status-And-Empowerment.cfm.

Demographic and Health Surveys Program. N.d. “Family Planning.” The DHS Program. Accessed March 19, 2018. https://dhsprogram.com/Topics/Family-Planning.cfm.

Demographic and Health Surveys Program. N.d. “The Gender Corner.” The DHS Program. Accessed March 19, 2018. https://dhsprogram.com/topics/gender-Corner/index.cfm.

step 2. theory of change, outcomes, and indicators

Map a theory of change to select appropriate outcomes and indicators

mapping a theory of change

The next step is to use the findings from our formative research to refine our theory of change for the program being tested. To identify good outcomes (the change or impact we expect to see) and indicators (observable signals we use to measure that change), we need a deep understanding of the pathways through which the program can affect people’s lives. A theory of change framework provides a structured approach to mapping these potential pathways using the findings from our formative research, along with relevant theory or lessons from completed impact evaluations. It documents a program’s logical chain of results, from the inputs to the outputs to the intermediate and final outcomes, along with the indicators to track each major step along the chain. We can also use it to specify the necessary assumptions to get from one step to another and the possible risks that could break the causal chain. It is beneficial for researchers and implementers to jointly develop and refine the theory of change for a program, even when they are part of the same organization.

In an evaluation about empowerment, we can use a theory of change to map changes in women’s resources, agency, and achievements, which often correspond to short-term, intermediate, and final outcomes. Box 4 shares an example of a logical framework summarizing the theory of change for an evaluation of a government policy instituting quotas for women leaders in village councils in India, in which researchers measured changes in women’s resources, agency, and achievements. For a more detailed guide on building a theory of change, see Chapter 5.1 of Running Randomized Evaluations: A Practical Guide29 or J-PAL’s research resources on Measurement and Data Collection.30

29 Glennerster and Takavarasha 2013, 180-190.

30 Abdul Latif Jameel Poverty Action Lab (J-PAL). N.d. “Measurement & Data Collection.” Research Resources. Accessed March 19, 2018. https://www.povertyactionlab.org/research-resources/measurement-and-data-collection.


box 4: building a theory of change to measure the impact of women leaders in india

In the early 1990s, a constitutional amendment in India called for a random one-third of village council leader, or pradhan, positions to be reserved for women. The village council, which encompasses between five and fifteen villages, is responsible for the provision of local infrastructure (such as public buildings, water, and roads) and for identifying government program beneficiaries. The village council is required to organize two village meetings per year, during which they present their proposed budget and report on their activities in the previous six months.

J-PAL affiliated researchers Raghabendra Chattopadhyay (Indian Institute of Management) and Esther Duflo (MIT) took advantage of the random assignment embedded in the reform to study the impact of mandated representation of female policymakers on the provision of social services between male- and female-led village councils. Table 2 shows a logical framework depicting the policy’s theory of change. In short, if quotas for women leaders are implemented and more women become chairs of their village councils, then women will become more involved in council policy discussions, and the councils’ investment decisions will better reflect women’s preferences. Finally, the quality of the public goods preferred by women will increase.

The researchers used several innovative indicators for measuring outcomes related to empowerment. Rather than rely on self-reported measures of women’s participation in village council meetings, they used minutes of the village meetings to count how many women spoke up. They analyzed records of the requests submitted to the village council by women and men to identify whether women’s preferences for investments in local public goods differed from those of men. They also tracked how women’s preferences varied by state. This classification allowed them to measure whether councils’ ultimate investments better reflected women’s preferences when leadership positions were reserved for women. Results from the evaluation suggest that reservations for female leaders affected policy decisions in ways that better reflected women’s preferences.31

31 Chattopadhyay, Raghabendra, and Esther Duflo. 2004. “Women as Policy Makers: Evidence from a Randomized Policy Experiment in India.” Econometrica 72(5): 1409-1443. https://doi.org/10.1111/j.1468-0262.2004.00539.x. Read a full summary of the study at: https://www.povertyactionlab.org/evaluation/impact-women-policy-makers-public-goods-india.


Table 2. Logical framework summarizing the theory of change of a policy to increase women’s participation in village councils in India.32

Each level lists its description, its indicator(s), and the assumptions/risks behind it.

inputs
  description: Quotas for women are passed in state legislature
  indicator: Passage of legislation
  assumptions/risks: Supreme court mandate is translated into effective legislation at the state level

outputs (resources)
  description: There are more women leaders in village councils (gram panchayats)
  indicator: Number of women leaders in council chair positions
  assumptions/risks: Quota legislation is implemented as designed in villages

outcomes (agency)
  description: Political participation
  indicator: Number of women speaking at general village council meetings
  assumptions/risks: Seeing women leaders emboldens women to speak up at meetings.

impacts (achievements)
  description: Public goods investments more closely match women’s priorities
  indicator: Types of public goods mentioned in women’s queries vs. men’s queries
  assumptions/risks: Women’s preferences for public goods differ from men’s.
  indicator: Number of public goods of different types
  assumptions/risks: The system is democratic enough to respond to an increase in women’s queries
  indicator: Repairs to public goods by type
  assumptions/risks: Responses to political pressure from women will impact what gets repaired.
  indicator: Recently built public goods by type
  assumptions/risks: New investments will be more in line with women’s needs than older investments.

  description: The quality of public goods that are priorities for women improves
  indicator: Reduced presence of microbes in drinking water
  assumptions/risks: Greater investment in women’s priority areas will improve quality of services.
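To make the impact-level indicators in Table 2 concrete, the sketch below compares the mix of public goods requested by women and by men, then compares average investment in one good between reserved and unreserved councils. It is only an illustration: the records, field names, and numbers are hypothetical and are not data from the study, and a real analysis would also need standard errors.

```python
from collections import Counter

# Hypothetical request records: (gender of requester, type of public good).
requests = [
    ("woman", "drinking water"), ("woman", "drinking water"),
    ("woman", "roads"), ("woman", "drinking water"),
    ("man", "roads"), ("man", "roads"),
    ("man", "drinking water"), ("man", "education"),
]

# Hypothetical council records: reservation status and number of investments.
councils = [
    {"reserved": True, "drinking water": 9, "roads": 3},
    {"reserved": True, "drinking water": 7, "roads": 4},
    {"reserved": False, "drinking water": 5, "roads": 5},
    {"reserved": False, "drinking water": 3, "roads": 6},
]


def request_shares(records, gender):
    """Share of one gender's requests that mention each type of good."""
    counts = Counter(good for g, good in records if g == gender)
    total = sum(counts.values())
    return {good: n / total for good, n in counts.items()}


def preference_gap(records):
    """Women's share minus men's share for every good anyone requested."""
    women = request_shares(records, "woman")
    men = request_shares(records, "man")
    goods = sorted(set(women) | set(men))
    return {good: women.get(good, 0.0) - men.get(good, 0.0) for good in goods}


def mean_investment(rows, reserved, good):
    """Average number of investments in one good for one group of councils."""
    values = [row[good] for row in rows if row["reserved"] == reserved]
    return sum(values) / len(values)


gap = preference_gap(requests)
water_diff = (mean_investment(councils, True, "drinking water")
              - mean_investment(councils, False, "drinking water"))
```

A positive entry in `gap` marks a good that women request relatively more often than men; the approach is informative only when such gaps exist.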

32 Glennerster and Takavarasha 2013, 189.

step 2. theory of change, outcomes, and indicators: challenges and tips

Tips for selecting outcomes and indicators for empowerment that can help address common measurement challenges

There are many challenges in measuring empowerment. The following section outlines some tips to address the challenges related to the selection of outcomes and indicators.

challenge 1: measuring people’s ability to make important life choices is difficult because we rarely observe decision-making directly.

Five approaches to identifying outcomes and indicators that capture agency and decision-making

When it comes to measuring empowerment, the outcomes related to agency and decision-making are often the most challenging to select. How can we measure the ability to make a choice when we only observe the outcomes of choices and not the process itself? There are several ways researchers try to tackle this challenge: measuring fundamental outcomes and tracking change over time, asking people about specific decision-making processes, measuring women’s and men’s preferences to see if outcomes move in the direction of women’s preferences, measuring psychological aspects of women’s abilities to set goals in line with their values and act on them, and observing choices directly. Each option has strengths and weaknesses, and combining two or more can help overcome their individual limitations. In the ideal scenario, we can measure the full process: tracking whether a program changes resources available to women, if this leads to changes in women’s agency, and finally if it also results in changes in well-being.

1. Measuring fundamental outcomes: Some outcomes are so fundamental to well-being that we often assume they reflect people’s status and decision-making ability. For example, if women are more malnourished than men, this likely reflects women’s lack of choice and agency. Measuring objective outcomes like this and tracking changes over time can help avoid reporting bias, yet this approach is most relevant in resource-poor settings or situations of severe poverty. For example, in contexts where malnutrition is not prevalent, it is likely not a good indicator of whether women have choice.

2. Asking people about decision-making processes: A more direct way to measure choice is to ask people about the decision-making process itself. The most commonly used survey modules, often adapted from USAID’s Demographic and Health Survey (DHS), ask women and men general questions about who makes decisions in their household. For example: “Who usually makes decisions about [healthcare for yourself]/[major household purchases]/[visits to your family or relatives] you, your husband/partner, you and your husband jointly, or someone else?”33

Asking this kind of general question is useful for comparing participation in household decision-making across contexts. While internationally standardized metrics are valuable, we should be thoughtful about what kinds of indicators will realistically pick up changes in our context. Asking more specific questions about a concrete scenario tailored to the choices women care most about in our study context may be easier for them to answer accurately, and may tell us more about whether they can make choices that matter to them. For example, we could ask, “if your child is sick and needs immediate health care, but your husband is not home, what would you do?” Or, “if you ever need medicine for yourself (for a headache, for example), could you go buy it yourself?” In the evaluation Rachel and coauthors conducted in Bangladesh, our team got different answers when we asked the specific versus general questions about health care decisions, implying that the general decision-making questions were not picking up on the characteristics we expected by themselves.34

A fruitful area for future measurement research is to conduct more validation exercises comparing different methods for asking about complex concepts like decision-making. More systematic investigations could help us determine whether it is worth making improvements or additions to current widely used questions. A recent example is an analysis of DHS data from twenty countries by World Bank researchers, which finds women’s responses to the general DHS decision-making questions are correlated with several other empowerment indicators.35 Another useful example from International Food Policy Research Institute (IFPRI) compares responses to decision-making questions with questions about autonomy and makes the case for calibrating questions to specific contexts.36

Until there is more data on concrete ways to improve decision-making questions, combining these kinds of questions with more objective outcome indicators (like education, health, income, etc.) or asking questions about the same concept in two different ways (e.g., asking about a concrete scenario and decision-making in general) can help us determine whether changes in general decision-making indices are meaningful.

Many researchers and practitioners have also created useful domain-specific survey modules for decision-making, like International Center for Research on Women (ICRW)’s modules for measuring economic empowerment and IFPRI, USAID, and The Oxford Poverty & Human Development Initiative (OPHI)’s index for measuring women’s empowerment in agriculture (WEAI).37 Oxfam’s “A ‘How To’ Guide to Measuring Women’s Empowerment” provides guidance on constructing a women’s empowerment index that includes decision-making and can be applied in different domains.38

33 Demographic and Health Surveys Program. 2017. “DHS Model Questionnaire – Phase 7.” The DHS Program. w-64 row 922. Accessed March 19, 2018. https://dhsprogram.com/pubs/pdf/DHSQ7/DHS7-Womans-QRE-EN-07Jun2017-DHSQ7.pdf.
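One way to check whether a general decision-making question and a concrete scenario question tell the same story is to cross-tabulate the two answers for each respondent. The sketch below is a minimal illustration with made-up responses; the field names `general_has_say` and `can_take_child` are hypothetical labels, not variables from the DHS or from the Bangladesh survey.

```python
# Hypothetical responses: one dict per respondent.
# general_has_say: says she decides about her healthcare alone or jointly.
# can_take_child: says she could take a sick child to the doctor on her own.
responses = [
    {"general_has_say": True, "can_take_child": True},
    {"general_has_say": True, "can_take_child": True},
    {"general_has_say": True, "can_take_child": True},
    {"general_has_say": True, "can_take_child": False},
    {"general_has_say": False, "can_take_child": True},
    {"general_has_say": False, "can_take_child": True},
    {"general_has_say": False, "can_take_child": True},
    {"general_has_say": False, "can_take_child": True},
    {"general_has_say": False, "can_take_child": False},
    {"general_has_say": False, "can_take_child": False},
]


def share(flags):
    """Share of True values in a list, or None if the list is empty."""
    return sum(flags) / len(flags) if flags else None


def discordance(rows, general_key, scenario_key):
    """Shares of respondents whose two answers point in opposite directions.

    Returns (a, b) where
      a = among those who look empowered on the general question,
          the share who say no in the concrete scenario;
      b = among those who look disempowered on the general question,
          the share who say yes in the concrete scenario.
    """
    empowered = [not r[scenario_key] for r in rows if r[general_key]]
    disempowered = [r[scenario_key] for r in rows if not r[general_key]]
    return share(empowered), share(disempowered)


share_a, share_b = discordance(responses, "general_has_say", "can_take_child")
```

Large values of either share suggest that the two questions are capturing different things, which is a signal to pilot and refine the questions before relying on one of them.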

3. Comparing gender preferences to changes in outcomes: We can ask women and men about their preferences before an intervention and track whether the outcomes we observe after it move more in the direction of women’s preferences. For example, in the study on women policymakers in India, researchers found that women’s preferences for investments in local infrastructure were different than men’s, and tracked new infrastructure projects in communities to see whether they moved to be more in line with women’s preferences.39 This approach is useful because it explicitly measures what women want, a key part of the definition of empowerment, along with how their preferences relate to changes in outcomes. Yet it only works if men and women have different preferences. In one evaluation, for example, Rachel and co-authors found no systematic differences in gender preferences for the uses of community driven development grants, so this approach was not an option. Furthermore, measuring preferences can be difficult if women’s preferences change over time or if people’s stated preferences reflect current social norms more than they reflect their true preferences (see Box 5 on page 25). It is useful to combine this approach with intermediate outcomes measuring women’s participation in decisions in order to draw the causal link between participation in decisions and changes in outcomes.

34 In response to the standard decision-making question, sixteen percent of women said they usually make decisions about their healthcare alone or jointly with their husbands. Given this response, we would call this group of women more empowered, yet nearly a quarter of this group also said they could not take a sick child to the doctor until their husbands came home. We also found discrepancies in the other direction: Over half of the women who appeared disempowered according to the standard question said that they could take a sick child to the doctor on their own, and even more telling, could buy medicine for themselves. For more see, Glennerster, Rachel, and Claire Walsh. 2017. “Is It Time to Re-Think How We Measure Women’s Household Decision-Making Power?” Abdul Latif Jameel Poverty Action Lab, September 7, 2017. https://www.povertyactionlab.org/blog/9-6-17/it-time-rethink-how-we-measure-women’s-household-decision-making-power-impact. The 2017 book Measuring Women’s Economic Empowerment: Critical Lessons from South America also describes scenarios in which standardized indicators failed to reflect women’s decision-making constraints (Martínez-Restrepo, Susana, and Laura Ramos-Jaimes (eds.). 2017. Measuring Women’s Economic Empowerment: Critical Lessons from South America. Springfield, VA: IDRC, Fedesarrollo. http://hdl.handle.net/11445/3482).

35 Donald et al. find that women who report having greater sole or joint decision-making power were also more likely to own land, work outside the home, earn more than their husbands, and not condone domestic violence, all outcomes we typically think of as signs of empowerment. Donald et al. 2017, 34-35.

36 Seymour, Greg, and Amber Peterman. 2017. “Understanding the Measurement of Women’s Autonomy: Illustrations from Bangladesh and Ghana.” IFPRI Discussion Paper No. 1656. https://www.ifpri.org/publication/understanding-measurement-womens-autonomy-illustrations-bangladesh-and-ghana.

37 Golla et al. 2011; Alkire et al. 2013.

38 Lombardini et al. 2017.

39 Chattopadhyay and Duflo 2004, 1411.

4. Measuring the psychological aspects of agency: We can also use tools from psychology to try to measure the psychological components of agency. For example, in Mexico, Manuela Angelucci (UT Austin) and J-PAL affiliated researchers Dean Karlan (Northwestern) and Jonathan Zinman (Dartmouth) found that access to microcredit reduced depression among potential borrowers as measured by a depression index.40 We can also try to measure a woman’s ability to: 1) set goals to achieve things she values; 2) perceive that she can achieve these goals; and 3) act on these goals. One tool for measuring the ability to set goals that are aligned with one’s own preferences and values is the Relative Autonomy Index (RAI). It measures whether a person believes their actions are driven by their own goals or by external factors like social norms or coercion.41 Locus of control scales quantify the degree to which a person thinks life events are caused by their own behavior or external factors, and may be useful for measuring whether people perceive that they have the capacity to achieve their goals. Self-efficacy scales ask people about their confidence in completing actions and are another popular way to measure whether people perceive they can achieve their goals.42 Many surveys also just ask people about whether they feel they have freedom of choice.43 To measure a person’s ability to act on their goals, we can use some of the other approaches we have already discussed, such as measuring their participation in decision-making and outcomes related to well-being.

One note of caution is that people in different contexts may find that the questions from tools like the RAI, locus of control, or self-efficacy scales don’t make sense to them or are difficult to answer. Before adding them to our questionnaire, we should be sure to first extensively pilot, validate, and if needed, adapt them to make sure they work in our context. For an in-depth discussion of the pros and cons of these measures, see the recent World Bank paper “Measuring Women’s Agency.”44

location: mexico. photo: alejandra rogel | j-pal/ipa

40 Angelucci, Manuela, Dean Karlan, and Jonathan Zinman. 2015. “Microcredit Impacts: Evidence from a Randomized Microcredit Program Placement Experiment by Compartamos Banco.” American Economic Journal: Applied Economics 7 (1): 151-82. 174. http://dx.doi.org/10.1257/app.20130537.

41 Donald et al. 2017, 7.

42 Donald et al. 2017, 17.

43 Donald et al. 2017, 20.

44 Donald et al. 2017.

location: sierra leone. photo: glenna gordon | j-pal/ipa

5. Observing choices directly: The last approach is to observe women and men making choices directly, either in a real-world setting or through a game or structured community activity. These methods are beneficial because by observing actions directly we do not have to rely on people’s reports of whether they participated in decisions. This type of direct observation can occur when decision-making is taking place in large or small group settings or between individuals. For example, we could count how many times women speak up in community meetings and see if it increases after the intervention takes place.

We could also create a scenario in which people have to make a decision during a survey and examine whether women exercise more agency. For example, in an evaluation of a community-driven development program in Sierra Leone, Rachel and co-authors gave communities the choice of two different thank-you gifts after we finished surveys. Enumerators recorded whether and how women participated in this community decision and whether they had an effect on which option was ultimately chosen.45 We could also prompt a similar decision at the household level and record how men and women’s preferences are incorporated into the final decision. For example, researchers created a proxy measure of women’s bargaining power in a marriage by offering women a choice of getting a slightly smaller cash transfer delivered directly to her or getting a slightly larger cash transfer delivered to her husband.46 While these approaches avoid reporting bias, they tend to be more expensive to implement than surveys and require additional enumerator training to do well. See Appendix 2 for more details on these and other quantitative non-survey instruments and when they may be appropriate to use.

45 Casey, Katherine, Rachel Glennerster, and Edward Miguel. 2012. “Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan.” The Quarterly Journal of Economics 127 (4): 1755-1812. https://doi.org/10.1093/qje/qje027.

46 Almås, Ingvild, Alex Armand, Orazio Attanasio, and Pedro Carneiro. “Measuring and Changing Control: Women’s Empowerment and Targeted Transfers.” National Bureau of Economic Research Working Paper No. w21717, 2015.


box 5: some challenges in measuring preferences

In contexts with high inequality between men and women, women may internalize their society’s views that they are of lower status, which may be reflected in the preferences they share in surveys. For example, fourteen percent of women stated in a recent global ILO Gallup poll that it is not acceptable for a woman in the family to have a paid job outside the home if she wants one.47 It is important to measure inequalities in basic needs and opportunities along with the extent to which these inequalities are accepted. We need to be careful not to impose an outsider view of what women should want, yet we also have to be careful that women may be reflecting society’s view of them rather than their actual preferences. In these cases, we may not want to use preferences as a way to help us measure agency.

Another challenge with measuring preferences to understand agency is that empowerment programs often try to change women’s preferences. The empowerment literature discusses the ability to “imagine the possibility of having chosen differently” as an important dimension of meaningful choice and empowerment.48 This implies women recognizing that there is another way that they could be treated and that women may be able to change their own situations. Economists tend to think about this using the idea of changing preferences, but the challenge is that it is hard to make clear statements about welfare when people’s preferences change.49 One way to avoid this issue is to decide that meaningful choices are ones that lead to changes in important outcomes like health, education, and income, as discussed above. We could then measure both changes in women’s ability to make a choice and changes in important outcomes that correspond to those choices.

challenge 2: empowerment is a process

Tip 1: Track each major step along the causal chain along with changes over time.

Measuring both intermediate and final outcomes can help us make a credible claim that changes in women’s outcomes are the result of their increased agency. For example, in the evaluation of reservations for women in village councils in India, it would have been more difficult to claim that the increase in spending on drinking water and road infrastructure was the result of women’s increased political participation without evidence that women voiced more opinions related to these public goods in village council meetings.50 We can also capture steps in the causal chain by measuring changes in women’s resources, agency, and achievements (see Table 3). The United Nations Foundation’s guidance on Measuring Women’s Economic Empowerment, for example, identifies direct, intermediate, and final outcomes to prioritize.51

Collecting panel data that tracks changes in the lives of the same people over time can also help us better measure the process of empowerment. We can also first ask women about their goals and plans, and measure progress over time. For example, we asked young women in Bangladesh about their education and income-generating goals for the future and tracked changes in these outcomes over time.

However, given that the empowerment process itself can lead women to reshape their plans and visions of the future, our interpretations should be flexible. In Bangladesh, for example, we spoke to one woman who described how having a job in the city changed some of her perceptions: rural society felt confining when she returned home from working in Dhaka.

47 Gallup, International Labor Organization 2017, 31.

48 Kabeer 1999, 441.

49 Grüne-Yanoff, Till, and Sven Ove Hansson, eds. 2009. Preference Change: Approaches from Philosophy, Economics and Psychology. Dordrecht: Springer Science & Business Media.

50 Chattopadhyay and Duflo, 2004.

51 Buvinic, Mayra, and Rebecca Furst-Nichols. 2013. “Measuring Women’s Economic Empowerment: Companion to A Roadmap for Promoting Women’s Economic Empowerment.” United Nations Foundation and Exxon Mobil Foundation. http://womeneconroadmap.org/sites/default/files/Measuring%20Womens%20Econ%20Emp_FINAL_06_09_15.pdf. For more information on the roadmap, see the website: United Nations Foundation and Exxon Mobil Foundation. 2013. “Women’s Economic Empowerment: A Roadmap.” Last accessed March 20, 2018. http://womeneconroadmap.org/.


table 3. measuring the process of empowerment by measuring resources, agency, and achievements

A good way to measure the process of empowerment is to measure women’s access to resources to make a decision, their agency and participation in the decision itself, and the final outcomes of these choices in terms of well-being. The table below summarizes how we attempted to do this when we measured young women’s empowerment related to reproductive health in our evaluation on child marriage and empowerment in Bangladesh.

Each row gives a concept followed by its example indicator(s).

resources
  Knowledge: Knowledge about different forms of contraception, risks of early marriage and teen pregnancy, and availability of health services
  Preferences: Preferred number of children and spacing between births
  Access to health care: Past use of health services, use of contraception

agency
  Negotiation: Talks to spouse about contraception
  Decision-making: Has a say in decisions about contraception (whether and which type to use)

achievements
  Child marriage: Age of marriage
  Reproductive health: Age of first birth, maternal morbidity


challenge 3: many aspects of empowerment are susceptible to reporting bias

Tip 1: When possible, complement indicators that are subject to reporting bias with more objective indicators or proxy indicators.

People may not feel comfortable speaking freely in response to survey questions about many topics related to women’s empowerment. Measuring a more objective or proxy indicator of the behavior in addition to gathering self-reported data can help us mitigate the consequences of this reporting bias. Say we want to examine the effect of a program on sexual behavior, sexually transmitted infection (STI) incidence, and decisions about childbearing. STI tests are far more reliable than self-reported data, but they are also more expensive. The incidence of childbearing among young women can be used as a proxy for the incidence of unprotected sex, and childbearing is much easier to observe objectively than sexual behavior. In Uganda, to measure whether an empowerment program increased young women’s control over decisions about sex, marriage, and having children, J-PAL affiliated researchers Oriana Bandiera (London School of Economics), Robin Burgess (London School of Economics), Imran Rasul (University College London), and co-authors used more objective indicators, such as incidence of teen pregnancy, early marriage, and cohabitation. They complemented these indicators with self-reported measures, such as the share of girls reporting having sex against their will and the ages at which they aspire to get married and/or have children.52

Tip 2: Triangulation can help capture outcomes that are challenging to measure or susceptible to reporting bias.

When we don’t have one ideal indicator, it can be helpful to use more than one metric. One survey question cannot give us a comprehensive understanding of whether a woman has a say in household spending decisions, for example. Combining responses from several survey questions into indices or families of indicators can help when we are trying to capture different aspects of an overall concept or cross-validate self-reported responses. This can also help when a more objective or proxy indicator is not available. For example, in a randomized evaluation of commitment savings accounts in the Philippines, J-PAL affiliated researchers Nava Ashraf (London School of Economics), Dean Karlan (Northwestern University), and Wes Yin (UCLA) specified women’s influence in household spending decisions as their main outcome of interest. They collected data on two indicators: 1) an index of women and men’s responses about who decides in nine common household spending decisions and, 2) household expenditures on what respondents identified as typically “male” or “female” goods.53 Using indices or outcome families can also help prevent us from cherry picking a single indicator that shows a significant impact by chance during analysis. However, indices have costs too. They force us to give relative weights to our various indicators. In the above example, using an index required researchers to put an implicit weight on women’s influence in decisions about spending at the market and spending on children’s schooling. Another great way to triangulate is to ask different people the same question and compare their answers.54
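As an illustration of how several survey items can be combined into one index, the sketch below standardizes each indicator against the comparison group and then averages the standardized values with equal weights. This is one common convention and an assumption on our part, not necessarily the method used in the Philippines study; the data and the names `decides_market` and `decides_schooling` are hypothetical. Note how the equal weighting is itself a choice about the relative importance of each decision.

```python
from statistics import mean, stdev

# Hypothetical respondents. Each indicator is coded so that a higher value
# means more influence (1 = she has a say, 0 = she does not).
rows = [
    {"treated": False, "decides_market": 0, "decides_schooling": 0},
    {"treated": False, "decides_market": 1, "decides_schooling": 0},
    {"treated": False, "decides_market": 0, "decides_schooling": 1},
    {"treated": False, "decides_market": 1, "decides_schooling": 1},
    {"treated": True, "decides_market": 1, "decides_schooling": 1},
    {"treated": True, "decides_market": 1, "decides_schooling": 0},
]
indicators = ["decides_market", "decides_schooling"]


def standardize(values, comparison_values):
    """Express values in standard deviations from the comparison-group mean."""
    mu = mean(comparison_values)
    sigma = stdev(comparison_values)  # requires variation in the comparison group
    return [(v - mu) / sigma for v in values]


def summary_index(data, names):
    """Equal-weight average of standardized indicators, one value per respondent."""
    columns = []
    for name in names:
        comparison = [r[name] for r in data if not r["treated"]]
        columns.append(standardize([r[name] for r in data], comparison))
    return [mean(scores) for scores in zip(*columns)]


index = summary_index(rows, indicators)
comparison_index = [x for x, r in zip(index, rows) if not r["treated"]]
treated_index = [x for x, r in zip(index, rows) if r["treated"]]
```

By construction the index averages zero in the comparison group, so the mean of `treated_index` can be read as a difference in standard deviation units.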

52 Bandiera, Oriana, Robin Burgess, Markus Goldstein, Niklas Buehren, Selim Gulesci, Imran Rasul, and Munshi Sulaiman. “Women’s Empowerment in Action: Evidence from a Randomized Control Trial in Africa.” World Bank Group Working Paper, July 2017. http://documents.worldbank.org/curated/en/707081502348725124/Women-s-empowerment-in-action-evidence-from-a-randomized-control-trial-in-Africa.

53 Ashraf, Nava, Dean Karlan, and Wesley Yin. 2010. “Female Empowerment: Impact of a Commitment Savings Product in the Philippines.” World Development 38 (3): 333-344. 336. https://doi.org/10.1016/j.worlddev.2009.05.010.

Ashraf, Nava, Dean Karlan, and Wesley Yin. 2006. “Household Decision Making and Savings Impacts: Further Evidence from a Commitment Savings Product in the Philippines.” Paper Presented at Yale University Economic Growth Center, New Haven, CT, June 2006, 29. https://ideas.repec.org/s/egc/wpaper.html.

54 For example, see Donald et al. 2017, 39 or Ambler, Kate, Cheryl Doss, Caitlin Kieran, and Simone Passarelli. 2017. “He Says, She Says: Exploring Patterns of Spousal Agreement in Bangladesh.” International Food Policy Research Institute Discussion Paper 01616, March 2017. http://ebrary.ifpri.org/cdm/ref/collection/p15738coll2/id/131097.


Tip 3. Frame the question indirectly by asking about a hypothetical situation.

If we anticipate that survey respondents might find it hard to answer certain questions about their own lives and families, we can try framing the questions differently. One option could be to pose the question by asking, “for someone similar to you in your community…” Another option is to use a vignette to describe a hypothetical decision or scenario a fictional person is facing and ask about that. See Box 6 for an example of this type of vignette.

box 6. ask about a hypothetical situation in a vignette

J-PAL affiliated researcher Seema Jayachandran (Northwestern University) and co-authors Diva Dhar (Bill & Melinda Gates Foundation) and Tarun Jain (Indian School of Business) are evaluating the impact of a school-based program in India promoting gender equity on students’ attitudes about gender and fertility decisions later in life. To measure attitudes about marriage and women’s labor market participation, researchers included a vignette to ask respondents their opinions on a hypothetical decision:

Pooja, a 21-year-old girl belongs to a village in Haryana. Since childhood, she has aspirations of becoming a police officer. After graduating from college, she appears for the Haryana police examination and is offered a job as a police officer. Her parents are worried about her job as they think that is not suitable for a woman. They also believe that it is her age to get married and they have found a prospective groom for her from a good family. Pooja, however, wants to take up the job and does not wish to get married. According to her parents, Pooja would not need to work after she gets married as her husband will take care of her. Pooja should, instead, focus on household work, help out her mother in law and eventually have children. Finally, her parents decide that instead of taking up the job, she should get married. Do you agree with the parent’s decisions?55

The vignette was tied to the specific context and designed to resemble a decision that respondents’ relatives or family friends may have faced. While the researchers also asked questions about general attitudes (e.g. “should women be allowed to work outside the home?”), the vignette may potentially elicit more honest or nuanced views because of the concreteness and complexity of the setup.

challenge 4: empowerment means different things in different contexts, but we may also want to compare across contexts

Tip: Complement context-specific indicators of empowerment with internationally standardized ones.

In general, it can be useful to select commonly used outcomes and indicators to facilitate comparison with other empowerment research. However, indicators that are valid measures of empowerment in the local context are likely to give us a more accurate and precise measure of empowerment than standardized outcomes that are not tailored to our context. For example, women’s formal labor market participation is a standard measure of empowerment in many middle- and high-income countries. However, this indicator may not be sensitive to changes in empowerment in rural communities in many low-income countries where there are few formal labor market opportunities for women or men. In these contexts, we may want to ask about informal work. In addition, certain kinds of work may be disempowering depending on the working conditions, so we may also want to consider job quality along with asking women about whether they want to work and how and why they entered their current jobs.

55 Dhar, Diva, Tarun Jain, and Seema Jayachandran. “Intergenerational Transmission of Gender Attitudes: Evidence from India.” NBER Working Paper No. 21429, July 2015.


challenge 5: prioritizing outcome measures is difficult

Tip: Prioritize indicators that are related to the main objectives of the program, de-prioritize indicators that are not.

One program is not likely to empower a woman in every realm of her life. It is important to be realistic about what a program can change and to prioritize a few main outcomes for which there is a strong logical connection to the program being evaluated according to a well-articulated theory of change. For example, if we are evaluating a program trying to increase women’s access to and use of family planning services in Burkina Faso, it may not be relevant to include a series of questions on household consumption, unless household consumption is an integral part of our plan for analysis. If we want to pick up where effects could spill over into other domains, we should use our theory of change to identify the specific areas that might experience change and prioritize measuring those. Beyond prioritizing outcomes, it can help to state them in terms of the specific attitudes, behaviors, and/or achievements we expect to see change. Incorporating too many indicators opens up the risk that some of the indicators will change just by chance, compromising the validity of our findings. Thus, it is critical to be thoughtful about how many metrics are necessary to capture changes in empowerment.

main takeaways for theory of change, outcomes and indicators

• We use the findings from our formative research to create a theory of change and select indicators that are relevant to barriers women and girls face in a particular context.
• To measure agency, we can try to observe choices directly. When this isn’t possible, we can ask about the decision-making process, compare gender preferences to changes in outcomes, and/or measure psychological aspects of goal-setting and agency.
• We can measure the process of empowerment by selecting indicators that track the major steps in the causal chain, including short-term, intermediate, and final outcomes, which can correspond to changes in women’s resources, agency, and achievements.
• We should think strategically about using locally tailored indicators along with internationally standardized ones.
• It can be useful to measure more than one indicator for outcomes in cases where one indicator doesn’t tell the full story or in cases where our indicators are susceptible to reporting bias.
• We can partially overcome the challenge of social desirability bias by asking about a hypothetical scenario or complementing subjective indicators with objective ones or proxy outcomes.
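The risk that some indicators change just by chance when many are measured can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that each indicator is tested independently at the same significance level (the 5 percent level is an illustrative choice). Real empowerment indicators are usually correlated, so the numbers are indicative only. The function names are hypothetical, and the Bonferroni threshold is shown as one common, conservative adjustment rather than a method prescribed here.

```python
def familywise_error_rate(n_indicators: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent tests looks significant
    purely by chance, when every true effect is zero."""
    return 1.0 - (1.0 - alpha) ** n_indicators


def bonferroni_threshold(n_indicators: int, alpha: float = 0.05) -> float:
    """Per-indicator significance threshold that keeps the overall
    false-positive risk at or below alpha (a conservative adjustment)."""
    return alpha / n_indicators


# Illustrative numbers of indicators (assumed values, not from any study).
for k in (1, 5, 10, 20):
    print(
        f"{k:>2} indicators: "
        f"chance of a spurious result = {familywise_error_rate(k):.3f}, "
        f"Bonferroni threshold = {bonferroni_threshold(k):.4f}"
    )
```

Under these assumptions, moving from one indicator to twenty raises the chance of at least one spurious "effect" from 5 percent to roughly 64 percent, which is one reason to keep the list of primary outcomes short.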

step 3. data collection instruments

Develop and validate data collection instruments that minimize reporting bias

Once we have selected our outcome measures and indicators, the next step is to select, develop, and pilot our data collection instruments.

collecting data with surveys

Surveys are the most common method for collecting data in an impact evaluation using quantitative analysis. They are good for collecting a large amount of data from an individual on a range of topics. Relative to many non-survey instruments, they can cover a lot of ground quickly and usually at a lower cost. However, it is important to remember that for some data points, surveys may produce unreliable information. Mayra Buvinic (United Nations Foundation) and Ruth Levine (Hewlett Foundation) have written about how surveys have the potential to minimize the role of women when the questions themselves have a gender bias, for example when survey instructions guide surveyors towards assuming that a man is the head of household.56 Another potential concern is that respondents may not be able to accurately recall the past. They may also misreport information on potentially sensitive topics, including gender attitudes, family planning, or how decisions are made at home. For example, in Rachel and coauthors’ evaluation on reducing child marriage in Bangladesh, the research team found that since dowry was illegal, we had to approach survey questions on the topic very carefully to get an accurate response. First, we asked, “Often in weddings, gifts are exchanged between the two families. Did you give any gifts to your daughter’s in-laws household?” Then, if they said yes, we asked what kinds of gifts they gave. If they mentioned cash, we asked how much.57

56 Buvinic, Mayra, and Ruth Levine. 2016. “Closing the Gender Data Gap.” Significance 13 (2): 34-37, 35.

57 Buchmann et al. 2017.


To help address the kinds of challenges common to surveys, we can complement subjective indicators with objective ones, consider using non-survey instruments, or give respondents options for answering questions privately. We can also ask the question in different ways or ask different people.

selecting respondents

Before designing our instruments, it is important to determine who our respondents will be. Important considerations include: Who is the target population of the program? Who knows the information we want to collect? Who is unlikely to manipulate the information? Who can we get the largest amount of information from at once? Should we only survey women or girls, or will it be valuable to gather data from other members of the households or communities? Who knows which piece of information is likely to vary by context. For example, in Rachel and co-authors’ evaluation on girls’ empowerment in Bangladesh, the research team surveyed girls’ parents about whether the girls were married, the age of marriage, and age of first birth. Since many of the girls had grown up and left their home communities by the time we conducted the follow-up survey, it was more efficient to collect this information from the parents who remained in the home communities. We also learned that girls typically came back home to visit their mothers for the first births, so mothers had fairly accurate information about whether and when the daughters had given birth.

Collecting information from both spouses in a household can be useful for examining bargaining power related to household spending decisions, family planning, and investment in children’s education and health. Interviewing men and women separately can also reveal how husbands’ and wives’ preferences differ. For example, J-PAL affiliated researchers Mushfiq Mobarak (Yale University) and Grant Miller (Stanford University) measured the demand for nontraditional cookstoves in Bangladesh. In the evaluation, they developed a way to measure household decision-making: a team of two enumerators visited each household, and while one enumerator interviewed the husband, the other conducted a separate interview with the wife. After completing the survey, either the husband or wife was given the opportunity to order a stove that directed smoke away from the cook, but was not able to consult his or her spouse before making the decision. They found that women had much higher demand for health-protecting cookstoves, but that they lacked authority to make purchases.58

Asking more than one person the same question can serve as a check to the validity of the main respondents’ answers, for example by asking both parents and young women about why they are missing school. In addition, it can be a useful way to identify systematic differences in perceptions. For example, Aletheia Donald, Gayatri Koolwal, and Markus Goldstein (World Bank) along with Jeannie Annan and Kathryn Falb (International Rescue Committee) found that husbands’ and wives’ responses to DHS survey questions about who makes decisions about large purchases and use of the husband’s earnings differ systematically within the same household across many countries.59 Asking husbands about their wives’ preferences and seeing how closely they match her own (and vice versa) can provide a measure of the flow of information between spouses. In Rachel and co-authors’ evaluation in Bangladesh, we used questions about how well husbands understood their wives’ preferences as a signal of better marriages.

turning indicators into good survey questions

Well-constructed survey questions can go a long way in mitigating measurement error and reporting bias. Good survey questions are easy to answer, pick up variation, minimize the risk of social desirability bias, have realistic recall periods, and are tailored to the local context. They should also be specific, neutral, understandable, clearly framed, and relevant.

Specific. Each question should only ask one thing at a time. For example, we may not want to ask a young woman, “at what age do you want to get married and have a child?” Instead, we should ask two separate questions, one about desired age of marriage and the other about desired age for having a child.

Neutral. The wording of the question should not bias respondents to give us a particular answer one way or another. For example, we would never ask the following question because it could lead to biased answers, “don’t you think that women can be good village council leaders?” We could instead ask, “who is your village council leader?” and then follow up with, “how effective a leader do you think he/she is?” and see how perceptions of effectiveness correlate with the leader’s gender.

Understandable. All survey questions should be relatively easy to comprehend by anyone in our sample. They should not contain unfamiliar terms or concepts that are not clearly defined in the survey prompt. Generally speaking, we should avoid abstract concepts. For example, in many cases, women may not be very familiar with the term “empowerment.” It could therefore be risky to ask a question like, “do you feel empowered to make decisions about how much to spend on your children’s schooling in your household?” Instead we could ask, “who decides how much to spend on your children’s schooling in your household?” We may find that we never use the term empowerment once in our survey measuring empowerment.

Clearly framed. The best questions have clear boundaries and a well-defined time frame. First, if the responses are multiple choice, the list of possible answers should be mutually exclusive: no two responses should overlap in meaning in a way that could confuse respondents. They should also be collectively exhaustive: no frequent answers should be missing from the list. For example, we may find that women respond “no” to the question, “have you visited the doctor in the past month?” but report that they visited other kinds of health care providers that we are interested in documenting. Asking “have you visited the doctor, a nurse, health clinic, or healer in the past month?” could capture a fuller picture of the health services women use. Second, any question asking a respondent to recall something from the past should include a clear time frame and this recall period should be short enough that the respondent can remember accurately. For example, instead of asking, “how many times did you take your child to the doctor in the past year,” we could ask, “have you taken your child to the doctor in the past month?” If she responds yes, we can then ask, “how many times have you taken your child to the doctor in the past month?”

58 Miller, Grant, and A. Mushfiq Mobarak. “Gender Differences in Preferences, Intra-household Externalities, and Low Demand for Improved Cookstoves.” NBER Working Paper No. 18964, April 2013. http://www.nber.org/papers/w18964.pdf.

59 Donald et al. 2017, 39.


Relevant. It is important to be respectful of study participants’ time, so all of our survey questions should measure indicators that will actually be used in our analysis. Several studies have shown that the accuracy of people’s answers can be affected by survey fatigue, which we can avoid by being judicious in the number of questions we include.60 Our questions should also be relevant to the local context. Say we want to examine women’s access to technology. We might ask about internet use in some contexts but ask about cell phone ownership and usage in others depending on the availability of services. We can sort out these details during the formative research phase through semi-structured interviews, participant observation, and spending time in the communities. We can also double-check this information when pretesting our survey instruments.

The information we gather during our formative research can also help us write better questions. It can help us select the right words from the local language(s) to use, identify sensitive words or topics to avoid, and figure out the most appropriate way to ask about a particular topic.

Appendix 1 features a catalogue of examples of survey questions and modules related to women’s and girls’ empowerment from completed impact evaluations by J-PAL affiliated researchers. Another useful resource is the University of California San Diego’s Evidence-Based Measures of Empowerment for Research on Gender Equality (EMERGE) website, which compiles survey questions that have been used to measure gender equality and empowerment along with guidelines on developing metrics.61

other ways to mitigate reporting bias

In addition to writing good questions, we can design the survey to make respondents feel more comfortable and therefore perhaps more willing to answer accurately. It is ideal to start the questionnaire with simpler questions, such as basic demographic information. Once the enumerator and respondent have had the opportunity to build a rapport, we can begin to ask questions about more sensitive topics. For highly sensitive or personal topics, it is better to have women enumerators interview women and men enumerators interview men. In some cases, we can ask respondents to enter their answers to sensitive questions privately and anonymously into a computer, phone, or tablet device, so that fear of judgment by the enumerator has less of an influence on their responses. We can also mitigate reporting bias by comparing self-reported answers to data from other sources. If we are interested in learning whether a woman redeemed a voucher for a family planning consultation at a local health clinic in the past month, for example, we can ask her directly and also check the administrative data from the clinic.
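Comparing self-reported answers with administrative data, as in the family planning voucher example, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the respondent identifiers, the data, and the function name are all hypothetical, and we assume both sources can be matched on a shared respondent ID, which is often the hardest part in practice.

```python
def compare_reports(self_reports, admin_records):
    """Compare yes/no survey answers with administrative records.

    Both arguments map a respondent ID to True/False (for example,
    "redeemed the voucher in the past month"). Only respondents that
    appear in both sources are compared.
    """
    shared = sorted(set(self_reports) & set(admin_records))
    over = [r for r in shared if self_reports[r] and not admin_records[r]]
    under = [r for r in shared if not self_reports[r] and admin_records[r]]
    agree = len(shared) - len(over) - len(under)
    return {
        "n_matched": len(shared),
        "agreement_rate": agree / len(shared) if shared else None,
        "over_reported": over,    # said yes, clinic has no record
        "under_reported": under,  # said no, clinic has a record
    }


# Hypothetical data for illustration only.
survey = {"A01": True, "A02": True, "A03": False, "A04": False, "A05": True}
clinic = {"A01": True, "A02": False, "A03": False, "A04": True}

print(compare_reports(survey, clinic))
```

A gap between the two sources does not tell us which one is wrong: clinic records can be incomplete too. The pattern of over- and under-reporting is still a useful signal of how much reporting bias a survey question may carry.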

60 Krosnick, Jon A., and Stanley Presser. 2010. “Question and Questionnaire Design.” In Handbook of Survey Research, edited by Peter V. Marsden and James D. Wright, 263-315. Bingley: Emerald Group Publishing, 291-292.

61 The University of California San Diego’s Evidence-Based Measures of Empowerment for Research on Gender Equality (EMERGE) website compiles survey questions that have been used to measure gender equality and empowerment along with guidelines for selecting metrics. Evidence-Based Measures of Empowerment for Research on Gender Equality (EMERGE). University of California San Diego. 2018. Accessed June 19, 2018. Emerge.ucsd.edu.


quantitative non-survey instruments

Non-survey instruments that allow for quantitative analysis, including direct observation, games, experimental vignettes, implicit association tests, and more, can offer more objective measures of some outcomes than surveys.62 They can also be helpful for quantifying things that are difficult to measure. This includes attitudes or activities that participants may not report honestly in a survey (e.g., prejudice, attitudes about gender norms), are highly subject to recall error (e.g., how many times women spoke in a community meeting), or things about themselves that respondents may not even be aware of (e.g., subconscious gender bias). Non-survey instruments are more complicated for enumerators to administer than surveys and tend to cost more. See Box 7 for two examples of non-survey instruments used to measure an indicator related to empowerment. Appendix 2 contains a catalogue with examples of non-survey instruments that could be useful in evaluations about empowerment, along with their pros and cons, and tips on when to use them. It covers direct observation and structured community activities, implicit association tests, vignettes, list randomization, purchase decisions, games, social interaction and network effects, participatory research methods, and biomarkers.

62 Some of these instruments, such as vignettes and list randomization, can be implemented in a survey context. We include them in this section because they require different processes and analyses than standard survey questions.


box 7. examples of two non-survey instruments

Games: Having study participants play a game can help measure qualities like altruism, cooperation, and trust. They are useful when we want to test theories of how people will respond to various incentives and scenarios, or categorize people into different groups based on their behavior in the game. However, games are only approximations of decisions in the real world, and they may not reflect what people would do in scenarios with bigger stakes or when they are not being observed.

In a randomized evaluation in Kenya, J-PAL affiliate Simone Schaner (Dartmouth College) examined how differences in spending preferences between husbands and wives affected demand for savings accounts.63 The research used a game to measure intra-household bargaining power to test whether the savings account programs had different impacts on women with more or less bargaining power. At the end of the survey, husbands and wives, who were being surveyed separately, were asked to divide a small cash prize between themselves and their spouse. Each spouse recorded his or her allocation separately on cards and placed the amount they allocated to themselves in their tin and the amount allocated to the spouse in the spouse’s tin. Then they came together to decide how to allocate the prize money between them and recorded it on cards, which they added to each of their tins. To ensure respondents’ privacy, the husband and wife also each added an additional envelope to each of their tins with a randomly selected amount of money. The husband and wife then chose one card from each of their tins and were immediately given the cash amount allocated to them on the card they chose. This game allowed researchers to identify women with relatively low or high bargaining power and test whether the impact of the intervention was different for women with higher or lower bargaining power. Women with larger differences between their individual and joint preferences for allocating the money were classified as having relatively lower bargaining power.

Excerpts from the bargaining game script:

“I have with me Ksh 700 for you to divide. You can keep any amount from Ksh 50 to Ksh 650 for yourself, and send the balance to your spouse. What you decide for yourself, I will put in the tin, hidden in this envelope. I will put the amount you decide for your spouse...into your spouse’s tin.”

“We will ask you and your spouse to come to an agreement over how to divide the Ksh 700. After you make a decision, I will put the amount intended for you into your tin and... the amount intended for your spouse into your spouse’s tin. I [also] have here a bag with all the possible money amounts, ranging from Ksh 50 to Ksh 650. I will ask you to select one of the envelopes in this bag and put it into your tin. This is the “secret-keeping” choice.”

“Once you have added this final amount, you will be asked to pick one of the envelopes from the tin and whichever envelope you draw, you will receive that amount today in cash.”
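The classification step in this game can be sketched as follows. This is a rough illustration and not the study's actual procedure: we assume the gap is the absolute difference between what a woman keeps in her individual choice and what she receives in the joint choice, that joint amounts fall in the same Ksh 50 to Ksh 650 range, and that "larger" means above the sample median. All names and data are hypothetical.

```python
from statistics import median

PRIZE = 700     # total amount to divide, in Ksh (from the game script)
MIN_KEEP = 50   # smallest amount a spouse may keep
MAX_KEEP = 650  # largest amount a spouse may keep


def preference_gap(individual_keep, joint_keep):
    """Absolute gap between what a woman keeps when deciding alone and
    what she receives under the couple's joint decision."""
    for amount in (individual_keep, joint_keep):
        if not MIN_KEEP <= amount <= MAX_KEEP:
            raise ValueError(f"amount {amount} is outside the allowed range")
    return abs(individual_keep - joint_keep)


def classify_bargaining_power(allocations):
    """Label each woman 'lower' or 'higher' relative bargaining power.

    allocations maps a respondent ID to (individual_keep, joint_keep).
    Assumption: gaps above the sample median signal lower bargaining power.
    """
    gaps = {rid: preference_gap(*pair) for rid, pair in allocations.items()}
    cutoff = median(gaps.values())
    return {rid: "lower" if gap > cutoff else "higher" for rid, gap in gaps.items()}


# Hypothetical allocations for four women, in Ksh.
wives = {"w1": (400, 350), "w2": (500, 200), "w3": (300, 300), "w4": (600, 250)}
print(classify_bargaining_power(wives))
```

Because the label is relative to the sample, it is suited to testing whether program impacts differ across groups (heterogeneous effects) rather than to making statements about any one woman's absolute level of bargaining power.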

63 Schaner, Simone. 2017. “The Cost of Convenience? Transaction Costs, Bargaining Power, and Savings Account Use in Kenya.” Journal of Human Resources 52 (4): 919-945. https://doi.org/10.3368/jhr.52.4.0815-7350R1.

Schaner, Simone. “Bargaining Game Script: The Cost of Convenience? Transaction Costs, Bargaining Power, and Savings Account Use in Kenya.” Research. Accessed June 8, 2018. https://sites.google.com/site/sschaner/home/research.


Experimental Vignettes: Vignettes are brief descriptions of hypothetical scenarios. They can be useful in many different aspects of research, from reducing the risk of social desirability bias in survey questions about sensitive topics to clarifying the meaning of concepts in our survey questions. Experimental vignettes, in which study participants are randomly assigned to hear one of multiple versions of the same story with a key detail changed, can also be used to measure subconscious biases or prejudices based on gender, race, ethnicity, or other factors. To measure gender bias, for example, we could create two identical versions of a vignette in which only the gender of the subject changes. We would then randomly assign study participants to hear one of the two versions and ask them the same follow-up questions about the vignette, allowing us to isolate the difference in responses caused solely by the gender of the subject in the hypothetical scenario. Typically, only one detail in the vignette can be changed to isolate the source of bias.

J-PAL affiliated researchers Lori Beaman (Northwestern University), Raghabendra Chattopadhyay (Indian Institute of Management), Esther Duflo (MIT), and Rohini Pande (Harvard University) along with Petia Topalova (International Monetary Fund) evaluated whether exposure to female leaders in Indian village councils changed perceptions about women’s effectiveness as leaders.64 As part of the survey, researchers played a short recording of a speech by a local leader responding to a complaint from a villager. Respondents were randomly assigned to hear the recording spoken by a man or woman. After the speech was over, they were asked to rate the leader’s effectiveness. This vignette allowed researchers to measure whether there was a subconscious bias that led people to rate female leaders as relatively less effective. They found that exposure to a female leader through the policy that reserved village council head positions for women reduced men’s bias against female leaders.

[Illustration: 1. A short recording of a speech by a local leader responding to a complaint from a villager is played. Respondents heard the recording spoken by either a man or woman. 2. After the speech, respondents were asked to rate the effectiveness of the leader.]
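The logic of an experimental vignette, random assignment to one of two versions followed by a comparison of average ratings, can be sketched as below. This is a simplified illustration and not the analysis used in the study cited above: the version labels, rating scale, data, and function names are hypothetical, and a real analysis would also report standard errors and account for the sampling design.

```python
import random
from statistics import mean

VERSIONS = ("male_voice", "female_voice")  # hypothetical labels


def assign_versions(respondent_ids, seed=0):
    """Randomly split respondents into two equal-sized groups, one per
    version of the recording. The seed makes the draw reproducible."""
    ids = list(respondent_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return {rid: VERSIONS[0] if i < half else VERSIONS[1] for i, rid in enumerate(ids)}


def rating_gap(assignments, ratings):
    """Mean rating of the female-voice version minus the male-voice one.
    A negative value means the same speech was rated lower when a woman
    delivered it."""
    by_version = {v: [] for v in VERSIONS}
    for rid, version in assignments.items():
        by_version[version].append(ratings[rid])
    return mean(by_version["female_voice"]) - mean(by_version["male_voice"])


# Hypothetical example: four respondents rate effectiveness from 1 to 10.
assignments = {1: "male_voice", 2: "female_voice", 3: "male_voice", 4: "female_voice"}
ratings = {1: 8, 2: 6, 3: 7, 4: 7}
print(rating_gap(assignments, ratings))
```

Because the two recordings are identical apart from the speaker's gender and respondents are assigned at random, any systematic gap in ratings can be attributed to the gender of the voice rather than to the content of the speech.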

64 Beaman, Lori, Raghabendra Chattopadhyay, Esther Duflo, Rohini Pande, and Petia Topalova. 2009. “Powerful Women: Does Exposure Reduce Bias?.” The Quarterly Journal of Economics 124, no. 4: 1497-1540.


incorporating qualitative research

Research teams often conduct semi-structured interviews and focus groups with small subsets of the larger evaluation sample to gain a more in-depth understanding of a particular phenomenon or generate hypotheses about why we observe certain behaviors and how and why a program may have changed behavior. This does not mean asking “How did this program change your life?” Such questions suffer from social desirability bias; but more importantly, people can find it hard to construct their own counterfactual of what their lives would have been like without the program. Instead, qualitative and quantitative researchers working together can use these interviews to form or investigate hypotheses and mechanisms. For example, in Rachel and co-authors’ evaluation in Bangladesh, the team of researchers is made up of several economists and a qualitative researcher, Shahana Nazneen. We conducted in-depth qualitative interviews with subsamples of young women, their parents, and local matchmakers. This helped us better understand how decisions about young women’s marriage are made and why a girls’ empowerment program did not affect child marriage rates but an in-kind transfer to unmarried adolescents up to age eighteen did.

Our quantitative survey showed that a woman’s decision on when and whom to marry was nearly always made by her parents, and the empowerment program did not change that. The theory of change behind the empowerment program was that either parents did not know the dangers of child marriage and would be informed by their daughters, or that parents were not taking their daughters’ interests into account and that daughters would be better able to bargain with their parents after the empowerment program. However, our qualitative interviews showed that parents did know the dangers and agonized over the decision. Many reported not wanting to marry their daughters as young as they did, but felt they were doing what was best for their daughters. Marriage offers came infrequently and parents reported taking offers because they worried they might never again receive an equally good offer with a dowry they could afford, as dowries increase substantially as young women get older.65 They also reported community pressure, with family and neighbors asking, “why isn’t your daughter married yet?” Together, these discussions helped us think about marriage decisions differently, as a search process with asymmetric information and signaling. Maybe dowries increased with age because grooms were concerned that there was a (negative) reason why some older girls were not married yet. Maybe the community stopped asking “why is your daughter not married yet?” after the in-kind transfer program because there was now a clear financial reason (the in-kind transfers) to delay. Researchers are now testing these hypotheses in the field. These conversations also helped explain why the impact of the program did not end when the program did. Once the in-kind transfers stopped, it would still take some time to find the right match, and the girls did not all get married immediately when they were eighteen and ineligible to continue receiving the transfers.

pretesting and piloting logistics for data collection instruments

Once we have designed our instruments, the crucial next step is to validate them with respondents from the communities in which our evaluation will take place, but outside of the actual sample for our evaluation. We will only know whether we are asking the right questions, whether the questions make sense to the people we will be interviewing, and whether they are picking up the right information after we ask people. Researchers can validate their data collection instruments by pretesting their survey questionnaires and non-survey instruments and piloting data collection logistics. Pretesting and piloting responses should not be included in our final data analysis.

1. Pretesting survey questionnaires involves implementing practice interviews or survey prompts with respondents similar to people in our sample. It can help us check how the target population interprets the survey questions. Are the questions easy to understand and answer? Are there any confusing words that need to be changed or questions that don’t pick up the information we’re looking for? Based on these practice interviews, we can fix the questions or wording in the survey and test the new version with a new practice respondent. The goal is for enumerators to be able to conduct practice interviews in which respondents can understand and answer all the questions well. There is no right number of pretests to conduct; we may find that fifteen are enough, or we may need to do up to fifty. It can be useful for several enumerators to participate in pretesting the survey, as differences in enumerator style can affect responses, and different enumerators will notice different problems with the questionnaire. If we are using non-survey instruments during or separate from our survey, it is critical to pilot them the same way we pilot surveys. Some additional pretesting techniques include:66

• Cognitive interviews: interviewees respond to the enumerator by describing their ongoing ideas and thought processes in addition to answering the survey question. This can help researchers learn about whether and how people understand their survey questions and identify the roots of any confusion or difficulties people have in answering them.67
• Respondent debriefings: researchers implement the survey and collect comments and feedback from respondents after they complete it. This can be easily and cost-effectively incorporated into pretesting.
• Expert reviews: experienced specialists in a particular topic or sector provide comments on the content or style of the survey.

2. Piloting logistics includes test runs of the entire data collection process, enabling us to identify possible sources of measurement error. Enumerators practice executing the full survey from start to finish, including recruiting respondents, conducting surveys, running non-survey instruments, digital data entry, and data quality checks. It is important to test the logistics for complicated or unpredictable data collection processes, such as recruiting survey respondents at events, or for testing technology in the field. Piloting logistics helps ensure that there are sufficient resources to implement the survey, that enumerators fit the job description well, and that the administrative processes work. Researchers can also incorporate pretesting techniques into the piloting. Overall, logistics piloting can be costly and time intensive, involving dozens of respondents. As such, if we are familiar and comfortable with the enumerators and administrative systems in a given context, it might not be necessary to do a full-scale piloting process.

While the time needed to validate the survey instruments varies, it is important to budget at least a few weeks before the actual survey is scheduled to begin to update and finalize the survey instruments.

65 Field et al. 2018, 31.

66 For practical insights on pretesting and piloting logistics, see:

Caspar, Rachel, Emilia Peytcheva, Ting Yan, Sunghee Lee, Mingnan Liu, and Mengyao Hu. 2016. “Pretesting.” In Cross-Cultural Survey Guidelines. Ann Arbor, MI: Survey Research Center, Institute for Social Research, University of Michigan. Accessed March 19, 2018. http://ccsg.isr.umich.edu/index.php/chapters/pretesting-chapter#one.

Ruel, Erin E., William E. Wagner, III, and Brian J. Gillespie. 2015. The Practice of Survey Research: Theory and Applications. Los Angeles: SAGE.

67 For an example of how cognitive interviewing helped strengthen the wording of survey questions, see: Malapit, Hazel J., Kathryn Sproule, and Chiara Kovarik. 2017. “Using Cognitive Interviewing to Improve the Women’s Empowerment in Agriculture Index Survey Instruments: Evidence from Bangladesh and Uganda.” Agri-Gender: Journal of Gender, Agriculture, and Food Security 2 (2): 1-22. http://agrigender.net/views/cognitive-interviewing-to-improve-women-empowerment-from-bangladesh-and-uganda-JGAFS-222017-1.php.


what are we looking for in pretests and piloting logistics?

Have we chosen the right people to interview? In the course of piloting we may discover that the people we are surveying do not have the information we want. An adolescent girl may not know how much her school fees are. Her parents and teachers may be more likely to know. If women and men are in charge of planting different crops on a farm, women may only have information about planting decisions for the traditionally female crops and men for the traditionally male crops. If we observe a clear pattern that several respondents do not know the information we are asking them about, we should consider other potential data sources for these questions. In the first example that could mean asking parents, and in the second example it could mean surveying men in addition to women. We must carefully weigh the benefits of receiving more accurate data on a given indicator against the increased cost of adding a new survey or group of respondents.

Have we worded our questions well, and do people understand them? Respondents may express confusion about what a question is asking, or they may respond with an answer that shows they did not understand the question in the same way we did. See Box 8 for one example of a survey question that needed clarification. We should update our survey as frequently as needed during piloting to make sure all questions are well understood.

box 8. working towards precise and locally tailored survey questions

In an evaluation Rachel and co-authors conducted in Sierra Leone, an early version of our survey contained the question, “do you belong to any social groups?” and many pilot participants replied, “no.” After asking follow-up questions, our enumerators came to find that people did belong to what we considered social groups (mosques, churches, farming or business cooperatives); they just did not label them “social groups” like we did. We amended the question to prompt people with a list of the most common groups we found people belonged to in our pilot surveys.

Do our instruments pick up variation? Our surveys and other data collection instruments must be sensitive enough to pick up the variation found in the population where our research is taking place. For example, if we are trying to measure women’s knowledge about their civil and political rights, the questions we use to assess their knowledge should not be so easy that most respondents get all of the questions right nor so hard that very few get any right. If we anticipate that answers to some questions may be similar, we should pretest the questions first to verify that this is the case. Pretesting can help us find the right combination of questions to get a range of different responses.

Is the plan appropriate given the culture and politics of the local context? Both our survey questions and data collection plan should be tailored to the local context. Say we want to examine whether women have opportunities for collective action in their community. We need to ask questions about types of collective action that make sense and actually occur in the specific context. Women may participate in rallies, protests, and issue-based political campaigns in some contexts, but in others these types of collective action may be rare or inaccessible to women so they would not be relevant to include as survey responses. We can learn these details during the formative research phase through semi-structured interviews, participant observation, and spending time in communities similar to where the evaluation will take place. We can also double check this information when pretesting our survey instruments.

During pretesting, we may find that the way we plan to conduct our survey may not be appropriate given the culture and politics of the area. For example, it may be easier to interview young women at school rather than at home to reduce survey costs. Yet parents and local leaders may not think it is appropriate for enumerators to survey girls alone in a place outside their homes. In this case, surveying girls at home would be more appropriate, though more expensive. We often cannot anticipate all of the community’s preferences, so it is important to consult local leaders during the formative research phase and pilot our survey processes before starting the official data collection.
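The variation check described under “Do our instruments pick up variation?” can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from the guide: the function name, the item names, the pretest scores, and the 0.9 and 0.1 cutoffs are all hypothetical assumptions chosen for the example.

```python
from statistics import mean

def flag_low_variation_items(responses, easy_cutoff=0.9, hard_cutoff=0.1):
    """Return knowledge items that almost everyone or almost no one got right.

    responses: dict mapping item name -> list of 0/1 scores (1 = correct).
    The cutoffs are illustrative assumptions, not values from the guide.
    """
    flagged = {}
    for item, scores in responses.items():
        share_correct = mean(scores)
        if share_correct >= easy_cutoff:
            flagged[item] = ("too easy", share_correct)
        elif share_correct <= hard_cutoff:
            flagged[item] = ("too hard", share_correct)
    return flagged

# Hypothetical pretest data: 10 respondents, three knowledge questions.
pretest = {
    "knows_voting_age": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "knows_inheritance_law": [1, 0, 1, 0, 0, 1, 1, 0, 1, 0],
    "knows_land_title_process": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}

flagged = flag_low_variation_items(pretest)
for item, (reason, share) in flagged.items():
    print(f"{item}: {reason} (share correct = {share:.2f})")
```

Items flagged this way are candidates for rewording or replacement before the next pretest round; an item with a middling share of correct answers, like the second one here, is more likely to distinguish between respondents.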


Is our survey too long? People can become fatigued during long surveys and their answers may become less accurate as time passes. Pretesting can help us determine whether our survey is too long and identify questions that could be cut. We may also consider placing the bulk of questions that require more mental energy earlier on in the survey or even conduct the data collection over multiple visits. Pretesting will give us a good estimate of how long the questionnaire takes, which we include in our script asking for respondents’ consent to participate in the study prior to beginning the exercise. We should note that as enumerators get used to implementing the survey, the process typically speeds up by about one-third.

Are the recall periods appropriate? It is nearly impossible to assess whether our recall periods are appropriate without posing our survey questions to people. If we find during pretesting that the question is too challenging for most respondents to answer precisely, we may need to reformulate it. During field testing, we can even test two recall period variations to see which one gives us the best information.

main takeaways for data collection instruments

• Most quantitative impact evaluations will use surveys as the main data collection instrument. There is also a wide array of non-survey instruments that generate quantitative data that researchers can use for outcomes that are hard to measure or that people may not speak about accurately or honestly (e.g., prejudice or gender bias).

• Incorporating qualitative interviews with a subset of our sample can also help us better understand particular phenomena and/or generate hypotheses about why a program did or did not work.

• Before we write our survey questions, we need to pick the right people to interview. In many cases it is useful to survey multiple sources (e.g., girls and their parents; both men and women).

• When writing survey questions, it is important to make them specific, neutral, understandable, clearly framed, and relevant.

• Pretesting our survey questionnaire with people similar to our target population is critical for ensuring questions make sense and pick up the information we are interested in. Cognitive interviewing and respondent debriefings can also be useful in these efforts.

• We need to make sure the sequencing, length, and recall periods work; that the questions are well understood and culturally appropriate; and that they pick up variation.

• We should repeatedly pretest and update the questionnaire until all questions are well understood and generate the information we want.

• If we have less experience collecting data in the context where the evaluation will take place, we should also pilot the survey logistics, from finding and enrolling respondents and administering the questionnaire to entering data and doing data quality checks.

step 4. data collection plan

Design a data collection plan that minimizes measurement error

The last step before officially launching the evaluation is to make sure our data collection process itself does not introduce significant opportunities for measurement error. The way we collect data can affect its accuracy. We have to think carefully about how often, when, and where we will collect data, in addition to who we select as enumerators. All of these factors have a gender component that should be considered and included in the data collection strategy. If we’re conducting a randomized evaluation, it is also imperative that all aspects of data collection (the enumerators, the questionnaires, and the timing and frequency of surveys) be identical for the treatment group and the comparison group.68

68 Data collection differences between the two groups could lead to non-classical measurement error, meaning error that is correlated with treatment status, which leads to biased impact estimates. The good news is that measurement error sources are less troublesome in randomized evaluations if we can reasonably assume that they happened at approximately the same rates in the treatment and comparison groups. For example, if people consistently misunderstand how certain questions were phrased across both groups, our impact estimate won’t be biased.

who collects the data?

The identity of the enumerator can affect the answers people give. Study participants may feel less free to speak openly about certain topics with enumerators of a particular gender, class, and/or ethnicity. As a general rule of thumb, enumerators should be from the country where the research is taking place and, when possible, from a similar region of the country. Enumerators should also be fluent in the languages spoken in the areas where they are conducting surveys, including local languages. Pretesting is a good opportunity to investigate whether the identity of enumerators affects respondents’ answers.

In general, when we are asking people about topics related to women’s or girls’ empowerment, it is best if the enumerator is the same gender as the respondent. This is particularly true in contexts that have social norms regulating the interaction between men and women. For example, in the evaluation that Rachel and co-authors conducted in Bangladesh, the young women participating were always interviewed by a woman because it was socially inappropriate for a woman and a man from outside her family to interact without a chaperone. If it is not possible to match respondents with enumerators of the same gender, we can collect and record data on enumerator gender and test whether answers vary systematically by enumerator gender in our analysis. Gender is not the only aspect of identity that can affect data collection. For example, a young woman from a low-income family in a rural community may feel uncomfortable speaking with an enumerator from a middle- or high-income background from a major city.

Nonetheless, some respondents would not feel free to speak honestly about certain topics, such as intimate partner violence or sexual assault, with any enumerator. In contexts where respondents are literate, one alternative to provide increased privacy for survey questions on particularly sensitive subjects could be to offer respondents the ability to answer directly on a tablet, phone, or computer. Audio Computer-Assisted Self-Interview (ACASI) technology involves listening to pre-recorded interview questions on headphones and answering the questions using a digital device and can be adapted to contexts with low literacy.

Because enumerator identity matters, using the same enumerators in both treatment and comparison groups is key. Asking existing program staff to collect data in the treatment group and hiring enumerators or a survey firm to conduct surveys in the comparison group can seem like a good way to save on research costs. Yet, respondents may feel less free to answer questions honestly when being interviewed by a staff member they recognize from an organization that has provided them with a program.69 Enumerators who are skilled in administering surveys and who know how to avoid asking leading questions are more likely to collect high-quality data than individuals who do not have experience conducting research.

69 This could lead to non-classical measurement error (measurement error that is systematically different between the treatment and comparison groups), which will bias our impact estimates.

do we always need to interview respondents alone?

Interviewing someone in the presence of another person can bias his or her answers. To eliminate this potential bias, our best option is to interview each respondent alone. When setting up the interviews, researchers should let respondents know that the interview should take place privately so that they can plan accordingly. If there is not enough space to interview respondents alone in their living quarters, we may need to identify nearby locations where we can use a more private space.

Yet, interviewing respondents alone can be challenging when working with populations that lack power, particularly if family members feel that it is inappropriate. Say we are interested in interviewing young women about their aspirations related to their careers, getting married, and having children, but during piloting we find that some parents express concern about strangers speaking with their daughters alone. One option is to offer to interview the young woman within sight, but not earshot, of her parents, such as on the porch or just outside the house. A well-written parental consent script can also help mitigate some concerns parents have about letting their children participate. The script should thoroughly explain the purpose of the research, the organization the enumerator represents, and the complete anonymity and privacy of their daughter’s responses, and it should give parents the opportunity to ask follow-up questions before giving consent.

Women with young children may often need to look after them during the interview. If there is no one else who can look after her young children during the questionnaire, the enumerator can proceed with the survey but make note of who else was present. The questionnaire should include a prompt asking the enumerator to record whether the respondent was alone and, if they were not alone, who else was present and a short explanation of why it was not possible to interview the respondent alone. We can check these notes regularly (on a daily or weekly basis) to develop new strategies that enumerators can try to overcome this and other challenges in future surveys.
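The suggestion to record enumerator gender and test whether answers vary systematically with it can be sketched as a simple permutation test on a difference in means. This is one possible approach under stated assumptions, not the authors’ method: the function name, the data, and the number of permutations are hypothetical, and the test treats each answer as independent, which ignores clustering by enumerator.

```python
import random

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed difference, share of shuffles at least as extreme).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between answer and enumerator gender
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed) - 1e-12:  # small tolerance for float ties
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical answers to one yes/no question (1 = yes), split by the
# gender of the enumerator who asked it.
asked_by_woman = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0]
asked_by_man = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]

difference, p_value = permutation_test(asked_by_woman, asked_by_man)
print(f"difference in share answering yes: {difference:.2f}")
print(f"permutation p-value: {p_value:.3f}")
```

A small p-value suggests that answers differ by enumerator gender more than chance alone would explain. The comparison is most informative when enumerators were assigned to respondents at random; otherwise the difference may partly reflect which respondents each enumerator happened to interview.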


when and where do we interview people?

There are certain times when respondents may be more or less available to participate in a survey. Pretesting and logistics piloting help us document how many people we were able to reach at various times of day and whether there were any days of the week when people were less available. We can use this information to create a survey timetable that fits respondents’ needs and availability.

Study participants may be more available during certain times of year. In contexts where many secondary school students study in boarding schools, it may be easier to interview adolescent girls at home during their school holidays. Farmers may be less available once planting season or harvest has started. Since many young women in Bangladesh migrate to cities to work in factories, we found that it was easier to interview them during a month with many holidays when many young women travel home to visit their families.

The time of year can also affect how individuals respond to our surveys. For example, men and women in agricultural households may have different workloads during the planting season compared to the harvest season, meaning that a time use survey could generate very different data if we conduct it during planting season versus harvest season.

Where we interview respondents can also substantially affect the cost of data collection. Interviewing students at school will be much less expensive than interviewing each of them at home. Similarly, interviewing women at the monthly savings group meeting place will cost less than a household survey. We may need to follow up individually with the respondents we cannot reach in these central locations, but this will still be less expensive than individually surveying each respondent at home.

Survey location is also important to pretest. For example, we may think it will be more efficient to interview women entrepreneurs in an outdoor market during the day than to interview them at home, but find that many women cannot stop working in order to answer questions. In this instance, we could consider conducting as many surveys in the market as possible and follow up with the women who could not participate in the survey at a less busy time, a different day of the week, or at home. If we decide to conduct interviews in public places rather than individuals’ homes, we should be sure to collect enough personal and contact information to be able to follow up with respondents later in subsequent survey rounds.

when should we start and end data collection and how frequently should we collect data?

A program’s effects are usually not instantaneous. We should begin to measure outcomes after the program has had an adequate amount of time to achieve its effects. It can be difficult to predict exactly how long this takes, particularly for complex processes like empowerment. Our theory of change can help us think through how long it will take for the program to start to have an impact. Consider a business training program for women entrepreneurs that aims first to improve their business management knowledge, then improve their business management practices, and as a result, increase their profits. We may expect knowledge and business practices to improve within the first couple of months after the program ends. Yet, it can take some time for business practice changes to result in changes in profits, so we want to be careful not to measure our final outcomes too early. Measuring outputs and intermediate outcomes along the way can also help us check whether our timeline for measuring final outcomes is reasonable. If the program is taking longer to affect our intermediate outcomes than originally expected, we could consider pushing back the start date of our endline survey.

Several factors can influence the timing for the final round of data collection, including cost, attrition, whether longer-term results are needed, and whether the results are needed by a certain date to make a particular decision. In Rachel and co-authors’ evaluation in Bangladesh, the research team was primarily interested in learning whether various programs helped young women delay marriage into adulthood, so we needed to wait several years until the girls began to get married to understand this impact. Research costs and the likelihood of attrition increase as the follow-up period gets longer. A new program could also have novelty effects, where people are more excited about a program when it first starts, automatically raising outcomes in the short term. If this seems likely, we should not end data collection within the first few months after the program starts. Additionally, if we expect our program to have a lasting effect on people’s lives, we should not end data collection sooner than six months or a year after the program ends.

What about data collection frequency? Collecting data more frequently increases costs, but it can also help us better capture intermediate outcomes and observe changes over time. If the impact of the program is likely to change (or even decay) over time, and if this information will be used to make a decision about investing more in a program, we may want to collect data more frequently and over longer periods.

main takeaways for data collection plan

• It is not just our survey content that affects the accuracy of our data: who conducts the survey, when, and how often also matter.

• Enumerator identity can bias people’s responses, so we need to ensure that respondents are comfortable with their enumerators. It is often most effective to have women interview women and men interview men.

• Find a reliable but culturally appropriate way to interview respondents alone.

• Select a time that is convenient for respondents but that also allows us to maximize the number of surveys we can complete.

• Collect data following a timeline that allows for outcomes to materialize according to the program’s theory of change.

• Use the same data collection processes and enumerators in treatment and comparison groups to avoid measurement error that is systematically different across treatment and comparison groups.
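The point made in footnotes 68 and 69, that reporting error shared by both groups leaves the impact estimate unbiased while error that differs by treatment status does not, can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the true effect of 0.5, the reporting shift of 0.3, the sample size, and all names are hypothetical, and the reporting error is modeled as a constant added to every answer in a group.

```python
import random

def estimated_effect(true_effect, shift_control, shift_treatment, n=20000, seed=1):
    """Difference in mean reported outcomes between treatment and comparison.

    shift_control and shift_treatment are systematic reporting errors
    (hypothetical, in outcome units) added to every answer in that group.
    """
    rng = random.Random(seed)
    control = [rng.gauss(0, 1) + shift_control for _ in range(n)]
    treated = [rng.gauss(0, 1) + true_effect + shift_treatment for _ in range(n)]
    return sum(treated) / n - sum(control) / n

TRUE_EFFECT = 0.5  # assumed true program impact

# Same reporting error in both groups: the error cancels in the difference.
same_error = estimated_effect(TRUE_EFFECT, shift_control=0.3, shift_treatment=0.3)

# Reporting error only in the treatment group (for example, respondents
# interviewed by program staff they recognize): the estimate absorbs the error.
differential_error = estimated_effect(TRUE_EFFECT, shift_control=0.0, shift_treatment=0.3)

print(f"same error in both groups: {same_error:.2f}")
print(f"error in treatment only:   {differential_error:.2f}")
```

In the first case the estimate should land close to the assumed true effect of 0.5; in the second it should land close to 0.8, overstating the impact by roughly the size of the differential reporting error.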

conclusion

Empowerment is a complex process, and measuring it can be challenging. Yet, collecting good data on where women lack power to make choices that are important to them and using impact evaluations to identify effective ways to help women gain greater agency are critical inputs for reducing gender inequality worldwide.

This guide has just scratched the surface of the many creative strategies researchers and practitioners from diverse disciplines have developed to help us better understand, measure, and track changes in women’s and girls’ empowerment. Building on this large body of work, we hope we have shed light on how to weigh the trade-offs between using different types of measurement tools and outlined a process for developing a measurement strategy that is right for the evaluation at hand. While there are concrete ways to address many of the common challenges of measuring empowerment, there is still a great need for more systematic measurement research and validation exercises to improve on our current approaches. By continuing to experiment with, refine, and improve how we measure empowerment, researchers and practitioners can generate evidence on approaches that can help women and girls realize their own vision of a better life.


about j-pal

The Abdul Latif Jameel Poverty Action Lab (J-PAL) is a global research center working to reduce poverty by ensuring that policy is informed by scientific evidence. Anchored by a network of more than 160 affiliated professors at universities around the world, J-PAL draws on results from randomized impact evaluations to answer critical questions in the fight against poverty.

povertyactionlab.org

about the gender sector

Gender norms and biases continue to constrain human potential around the world. J-PAL’s Gender sector produces cross-cutting insights on promoting gender equality and women’s and girls’ empowerment and on how social norms related to gender affect the outcomes of social programs.

povertyactionlab.org/gender

• J-PAL Global: MIT
• J-PAL North America: MIT
• J-PAL Europe: Paris School of Economics
• J-PAL South Asia: IFMR, India
• J-PAL Southeast Asia: University of Indonesia
• J-PAL Latin America & Caribbean: Pontificia Universidad Católica de Chile
• J-PAL Africa: University of Cape Town