THE ORIGIN, EVOLUTION, AND VARIATION OF ROUTINE STRUCTURES IN OPEN SOURCE SOFTWARE DEVELOPMENT: THREE MIXED COMPUTATIONAL-QUALITATIVE STUDIES

by ARON LINDBERG

Submitted in partial fulfillment of the requirements

for the degree of Doctor of Philosophy

Dissertation Committee:

Kalle Lyytinen, PhD, Case Western Reserve University (chair)

Fred Collopy, PhD, Case Western Reserve University

Richard Boland, PhD, Case Western Reserve University

Jagdip Singh, PhD, Case Western Reserve University

Youngjin Yoo, PhD, Temple University

James Howison, PhD, University of Texas-Austin

WEATHERHEAD SCHOOL OF MANAGEMENT

DEPARTMENT OF DESIGN & INNOVATION

CASE WESTERN RESERVE UNIVERSITY

August 2015

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of

Aron Lindberg

candidate for the Doctor of Philosophy degree*.

(signed) Kalle Lyytinen

(chair of the committee)

Richard Boland

Fred Collopy

Jagdip Singh

Youngjin Yoo

James Howison

(date) May 19, 2015

*We also certify that written approval has been obtained for any proprietary material contained therein.

Copyright © 2015 by Aron Lindberg. All rights reserved.

Dedication

Tove Jansson, in her book Moominpappa at Sea, wrote:

“But on an occasion like this we must wait for sunset. Setting out in the right way is just as important as the opening lines in a book: they determine everything.” He sat in the sand next to Moominmamma. “Look at the boat,” he said. “Look at The Adventure. A boat by night is a wonderful sight. This is the way to start a new life, with a hurricane lamp shining at the top of the mast, and the coastline disappearing behind one as the whole world lies sleeping. Making a journey by night is more wonderful than anything in the world.”

This work is dedicated to my fiancée, Hope Lu, without whom I could never have launched the journey of which this dissertation is the start.

Table of Contents

ACKNOWLEDGMENTS
ABSTRACT
INTRODUCTION
THEORETICAL FRAMING
THE INFORMATION PROCESSING VIEW
SOCIAL STRUCTURE IN ORGANIZATIONS
RESEARCH QUESTIONS
RESEARCH DESIGN
CHARACTERISTICS OF DIGITAL TRACE DATA
SAMPLING & DATA COLLECTION
ANALYSES
ASSESSING VALIDITY
OVERVIEW OF THREE COMPLEMENTARY STUDIES
ORIGIN: A CROSS-SECTIONAL STUDY
EVOLUTION: A LONGITUDINAL STUDY
VARIATION: A COMPARATIVE STUDY
THE STRUCTURE OF A THREE-PAPER DISSERTATION
STUDY #1: THE ORIGIN OF ROUTINES AS PROBLEM SOLVING MECHANISMS
COORDINATION AND VARIATION IN OSS DEVELOPMENT
AN INFORMATION PROCESSING VIEW OF OSS DEVELOPMENT
A CROSS-SECTIONAL STUDY OF OSS ROUTINES: TASK COMPLEXITY AND ROUTINE VARIETY
THEORIZING: A PROCESS MODEL OF OSS AS AN INFORMATION PROCESSING SYSTEM
DISCUSSION
STUDY #2: THE EVOLUTION OF ROUTINES AS RESPONSES TO ENVIRONMENTAL SHIFTS
LIFECYCLES IN OSS DEVELOPMENT
EXPLAINING HETEROGENEITY IN OSS DEVELOPMENT ROUTINES ACROSS TIME
A LONGITUDINAL STUDY OF OSS ROUTINES: SHIFTS IN ROUTINE COMPOSITIONS
THEORIZING: TOWARDS AN OSS LIFECYCLE MODEL
DISCUSSION
STUDY #3: THE VARIATION OF ROUTINES AS RESPONSES TO ORGANIZATIONAL AND TECHNICAL CONDITIONS
EXPLORATORY DATA MINING USING SEQUENCE ANALYSIS
ROUTINE CHARACTERISTICS
EXPLAINING HETEROGENEITY OF OSS DEVELOPMENT ROUTINES ACROSS PROJECTS
A COMPARATIVE STUDY OF OSS ROUTINES: EXPLAINING ROUTINE HETEROGENEITY
AN INFORMATION PROCESSING VIEW OF OSS DEVELOPMENT
THEORIZING: EXPLAINING PATTERNS OF ROUTINE HETEROGENEITY
DISCUSSION
CONCLUSION
THEORETICAL CONTRIBUTIONS
METHODOLOGICAL CONTRIBUTIONS
PRACTICAL CONTRIBUTIONS
LIMITATIONS & FUTURE RESEARCH
APPENDIX A: DATA EXTRACTION R/SQL QUERY
APPENDIX B: DATA EXTRACTION RUBY SCRIPT
APPENDIX C: DATA EXTRACTION R SCRIPT
APPENDIX D: DATA PROCESSING R SCRIPT
APPENDIX E: SEMI-STRUCTURED INTERVIEW PROTOCOL
REFERENCES

List of Tables

TABLE 1. FOUR DIMENSIONS OF ORGANIZATIONAL SOCIAL STRUCTURE
TABLE 2. KEY ASPECTS OF ROUTINIZED STRUCTURE IN OPEN SOURCE ORGANIZING
TABLE 3. ASPECTS OF DIGITAL TRACES AT GITHUB
TABLE 4. DEFINITIONS OF ACTIVITY TYPES
TABLE 5. SAMPLE SIZES
TABLE 7. OVERVIEW OF THE THREE STUDIES IN THIS DISSERTATION
TABLE 8. DEFINITIONS FOR IPV CONCEPTS IN OSS
TABLE 9. ACTIVITY FREQUENCIES PER CLUSTER
TABLE 10. DATA COLLECTION
TABLE 11. CODING SCHEME
TABLE 12. SUMMARY OF CLUSTER CHARACTERISTICS
TABLE 13. REPERTOIRE OF ACTIVITIES – TRIAGING TASKS
TABLE 14. REPERTOIRE OF ACTIVITIES – TRANSFERRING INFORMATION
TABLE 15. REPERTOIRE OF ACTIVITIES – DISCOURSE-DRIVEN PROBLEM SOLVING
TABLE 16. REPERTOIRE OF ACTIVITIES – DIRECT PROBLEM SOLVING
TABLE 17. LOGIT REGRESSIONS OF CODE MERGING
TABLE 18. REGRESSIONS OF DURATION
TABLE 19. INFORMATION PROCESSING RESPONSES TO COMPLEXITY
TABLE 20. DATA COLLECTION
TABLE 21. SUMMARY OF CLUSTER CHARACTERISTICS
TABLE 22. ACTIVITY FREQUENCIES
TABLE 23. REPERTOIRE OF ACTIVITIES – DISCOURSE-DRIVEN PROBLEM SOLVING
TABLE 24. REPERTOIRE OF ACTIVITIES – DIRECT PROBLEM SOLVING
TABLE 25. PROPERTIES OF RELEASES
TABLE 26. SOCIAL NETWORK STATISTICS
TABLE 27. OSS LIFECYCLE MODEL
TABLE 28. SAMPLE SIZES
TABLE 29. SAMPLE SIZES
TABLE 30. DEVELOPMENT HEURISTICS
TABLE 31. ROUTINE HETEROGENEITY
TABLE 32. CONTRIBUTIONS

List of Figures

FIGURE 1. RESEARCH PROCESS
FIGURE 2. THE DUAL NATURE OF DIGITAL TRACE DATA
FIGURE 3. THE PROCESS OF GENERATING & ANALYZING DIGITAL TRACES
FIGURE 5. RESEARCH PROCESS
FIGURE 6. STRUCTURE OF FINDINGS
FIGURE 7. EVALUATING CLUSTER SOLUTIONS
FIGURE 8. VISUALIZATION OF ROUTINE CLUSTERS
FIGURE 9. AGGREGATED ROUTINE STRUCTURE
FIGURE 10. COMPUTATIONAL-QUALITATIVE RESEARCH METHOD
FIGURE 11. ROUTINE HETEROGENEITY ACROSS TIME
FIGURE 12. CLUSTER FIT STATISTICS
FIGURE 13. VISUALIZATION OF ROUTINE CLUSTERS
FIGURE 14. DISTRIBUTION OF ROUTINE CLUSTERS ACROSS TIME
FIGURE 15. DISCOURSE-DRIVEN & DIRECT PROBLEM SOLVING
FIGURE 17. MATERIAL STRUCTURES OF CODE BASES
FIGURE 18. PROCESS MODEL

Acknowledgments

It is summer in Sweden, and I am sitting at a library in Halmstad, a small city on the west coast. While I am now an émigré, coming back to Sweden every summer provides me with a sense of rhythmic cyclicality, and now, closure.

On the other side of the Atlantic, Cleveland became a home for me. Strangely enough, I have never felt as rooted in any other place as I have in the rust-mixed soil of the Midwest. Cleveland is an underdog, and its rugged beauty etched itself into my heart.

This sense of grounding was probably the key condition that enabled me to engage with the PhD journey in a deep, sustained, and focused manner. As I reflect on this rootedness I am reminded of a number of key individuals who kept me firmly grounded and connected.

I want to thank my fiancée Hope for helping me to carefully consider every crucial decision that has been made along this journey. I am also grateful to my parents, as well as my siblings, for providing the terra firma necessary to rebalance myself whenever it was needed.

Crucial to the intellectual growth I have experienced is obviously Kalle Lyytinen, who taught me how to think. I am also grateful to his wife, Pirjo, whose friendship and joie de vivre have been a delight. Also, not to forget, Hessu, our common, furry little friend, who unwittingly provided me with many breaks (walks).

I am grateful to my committee members: Youngjin Yoo, Jagdip Singh, James Howison, Dick Boland, and Fred Collopy. Your critique has strengthened my thinking and work, and this is an invaluable gift.


Further, Nick Berente and James Gaskin have provided endless amounts of coaching, mentoring, and support in crafting the studies in this dissertation. Through you I have learnt the nuts and bolts of scholarship.

The participants of the various research groups that I have been part of have been key to my intellectual maturation and socialization into the scholarly community: Sean Hansen, Bill Robinson, Amol Kharabe, Tianjie Deng, Xuan Xiao, Mengling Yan, Deepa Gopal, Sungyong Um, Rob Kulathinal, and Zhewei Zhang.

Working with the Doctor of Management Program at Weatherhead has been a wonderful experience, and has contributed greatly to my growth as a researcher and teacher. I want to especially recognize Sue Nartker and Marilyn Chorman for their support and friendship.

My friends in Cleveland have provided me with a warm environment, and I am grateful for their presence and support: Chris Lyddy, Sagree Sharma, and their daughter Amiti (whose home I briefly shared when I needed it the most); Angela and Jacob Oetama-Paul and their son Elias (breaking bread together provided many thought-provoking conversations); Phil & Soldrea Thompson (Go Cavs!); Tim and Leigh Henderson (interdisciplinary scholarship was apparently insurmountable, but it led to many enjoyable lunches); Tiffany Schroeder (for helping me survive my early struggles); and Colleen Koehler and Matt Campbell (for widening my and Hope’s culinary horizons).

While I have many friends in Sweden whose friendship I value, I want to especially mention Marcus Linder, who blazed the trail into academia (and other things of importance, such as marriage and children) and provided me with countless opportunities to reflect throughout the years.


Finally, I am grateful to the Rubinius, Rails, Django, and Bootstrap Open Source communities, who freely shared their experiences with me, and thereby enabled the studies herein to become reality.

In sum, I stand (sit actually) in awe of the complex network that enabled this process to unfold as it has. Thank you.


THE ORIGIN, EVOLUTION, AND VARIATION OF ROUTINE STRUCTURES IN OPEN SOURCE SOFTWARE DEVELOPMENT: THREE MIXED COMPUTATIONAL-QUALITATIVE STUDIES

Abstract

by

ARON LINDBERG

Open Source Software (OSS) is a perplexing and complex context for the study of organizing. Effective coordination is often achieved despite weak, ephemeral, and virtually mediated relationships among volunteering developers. Understanding why this is possible and how such coordination forms emerge can provide us with important insights into how to build capabilities for a wider range of digitally mediated collaboration processes. This dissertation explores the origin, evolution, and variation of routines, which contribute to effective coordination of OSS development tasks. The dissertation comprises three separate studies. The first study uses computational sequence analysis and qualitative inquiry to conduct a cross-sectional case study of a single OSS project. This study explores how routines emerge and what information processing functions they serve in supporting coordination of OSS development. The second, longitudinal, case study seeks to show how routines evolve over longer periods – in this case across a multiyear release cycle. This serves as the basis for developing an OSS lifecycle model based on the identified shifts in the dominant routine patterns during consecutive phases of the release cycle. The third, comparative, case study examines four OSS projects to find out to what extent alternative ‘imprinted’ problem-solving rationalities shape observed variation in routine structures across the four projects. All three studies deploy new tools and propose new constructs to explain the origin, evolution, and variation of routines in OSS development. While the dissertation limits itself to studying routine-based coordination in the OSS context, the models and methods proposed herein can be generalized to a wider range of virtual (design) contexts with shared ‘open’ characteristics that provide anybody access to shared design artifacts. These contexts foreshadow many new organizing forms that are likely to become more prominent as myriads of products are increasingly designed through digitally supported and distributed collaboration forms.

Keywords: Open Source Software, Information Processing, Capabilities, Social Structure, Routines, Evolution, Emergence, Sequence Analysis, Case Study Method


Introduction

Open Source Software (OSS) is software that is developed using ‘open’ principles, meaning that the code is publicly available; the code is also created by distributed and autonomous volunteer-developers who work on shared development platforms (Crowston, Wei, Howison, & Wiggins, 2012). The beginnings of OSS can be traced to the burning desires of a few ambitious individuals – such as Richard Stallman and Linus Torvalds – who tried to solve particular software design problems they faced in their daily work – in most cases writing specific software tools (e.g. text editors such as EMACS) or operating systems (such as Linux). This fundamental principle of OSS work is illustrated vividly in Eric Raymond’s seminal paper The Cathedral and the Bazaar (1999: 24-25) when he describes his early experiences in developing the email client ‘fetchmail’:

“Since 1993…I had gotten quite used to instant Internet email. For complicated reasons, it was hard to get SLIP to work between my home machine (snark.thyrsus.com) and CCIL… I needed a POP3 client. So I went out on the net and found one. Actually, I found three or four. I used pop-perl for a while, but it was missing what seemed an obvious feature, the ability to hack the addresses on fetched mail so replies would work properly. The problem was this: suppose someone named "joe" on [CCIL] sent me mail. If I fetched the mail to snark and then tried to reply to it, my mailer would cheerfully try to ship it to a nonexistent "joe" on snark. Hand-editing reply addresses to tack on "@ccil.org" quickly got to be a serious pain. This was clearly something the computer ought to be doing for me. But none of the existing POP clients knew how. And this brings us to the first lesson: 1. Every good work of software starts by scratching a developer's personal itch.”

From these idiosyncratic desires to solve local, personal problems, a new way of developing software emerged over a period of ca. 10 years between 1985 and 1995. This new way was mediated by a growing number of technical (internet) platforms – such as e-mail, mail groups, IRC, code hosting and version control systems – and it resulted in increasingly complex software (such as the Linux operating system or the Apache server). As the code and related activities grew increasingly complex, more comprehensive development platforms were built (such as Sourceforge, git, and later Github). These platforms now coordinate the production of a myriad of different types of software of varying sizes and scope – yet these projects are still often initiated by a personal ‘itch’ of a single developer who then seeks followers and collaborators to scratch that itch.

At the same time, public and corporate information infrastructures have become heavily dependent on OSS products. Such products dominate operating systems (e.g. Linux), servers (e.g. Apache), databases (e.g. MySQL), and programming languages (e.g. Python). The combination of these particular tools is often referred to as the LAMP stack, which now manages the backrooms of many digital infrastructures. Therefore, understanding how such complex software is designed and developed is becoming increasingly important.

Early OSS research focused primarily on delineating the unique characteristics of this emerging mode of software development (Mockus, Fielding, & Herbsleb, 2002). The structure of the developer communities was characterized quantitatively, whilst qualitative inquiries provided rich descriptions of OSS development processes. In addition, significant attention has been paid to why voluntary developers contribute to OSS projects. Though, at first, their behavior seems shockingly altruistic, research has uncovered that contributing to OSS projects builds participants’ professional networks and competencies as well as provides vital learning opportunities. While the pay-off is not immediate or necessarily financial, OSS developers do not contribute for altruistic reasons alone (Bonaccorsi & Rossi, 2006; Lakhani & von Hippel, 2003; Roberts & Hann, 2006; Shah, 2006).

Subsequent research has focused on the developer collectives and their behaviors. Multiple research questions have been addressed, such as the growth rates of communities, their social network structures, and their governance approaches. One main underlying concern has been to understand how it is that a solitary activity, where developers contribute driven by disparate personal motivations under weak (or non-existent) governance and management structures, can be coordinated (i.e. through managing task interdependencies) effectively so as to create complex, integrated, high quality software.

Most real world software development projects are complex – a fact we have known since the publication, in the 1970s, of Brooks’ classic The Mythical Man-Month (1995). Many ways of coping with complexity have been proposed, such as software design methodologies and project coordination techniques (Zmud, 1980). The OSS phenomenon, however, is unique among software projects in that it operates under a number of conditions that are usually viewed as barriers to successful coordination. The participants and activities are geographically and temporally dispersed, and the projects experience high turnover due to the volunteer nature of participation. Therefore, it is often difficult, if not impossible, to apply software development methodologies or formal project coordination mechanisms. Hence, it is pertinent to ask: how can OSS developers coordinate project-wide task interdependencies effectively given their unique features?


Tentative answers to this question have begun to emerge. For example, Howison & Crowston (2014) analyzed socio-technical affordances related to managing code, and the resulting reduction of complexity in development tasks, as one important enabler of OSS coordination. Several Social Network Analysis (SNA) based studies have linked specific network structures between developers to successful outcomes (see e.g. Singh, Tan, & Mookerjee, 2011). These studies explain “how a solitary activity becomes social” either through artifact design principles or through social structures mediated by technical platforms. They also represent a fruitful area of future research, which has broader implications for understanding the organizing of design in the digital age, where time and space constraints matter less. One aspect which these studies have not examined, however, is how the actual activities of OSS projects are patterned and organized, and how this relates to the effectiveness of coordination.

The overall problem that this dissertation deals with is the activity-based coordination of complex software development tasks in the context of OSS. That is, to accomplish such tasks, information related to multiple pieces of code, including their histories and envisioned trajectories, needs to be collated, integrated, and processed in a way that a) enables the writing of functional code, and b) changes the code in a way that is acceptable to the community. For example, a developer may want to fix a certain bug. In order to do this he needs to understand how the bug came about in the first place and which pieces of code it interrelates with. As soon as he has collected this information he can begin to process it, i.e. write code to solve the bug. This process has to be enriched with information with regard to appropriate standards for resolving bugs within the particular community that he or she works within. As such, this activity represents a Gordian knot of information that must be processed by single or multiple developers working autonomously.

As previously noted, most research has focused on formal mechanisms that coordinate task interdependencies. From classical organization theory we learn that several (formal) coordination mechanisms such as markets or hierarchies (Powell, 1990), networks (Benkler, 2006), or formal procedures (March & Simon, 1958) offer powerful means to manage task interdependencies. The growth of software methodologies (e.g. waterfall, agile, XP, etc.) is an attempt to translate one of these general mechanisms (formal procedures) into practical means for coordinating complex software development tasks. Following such routines helps developers to weed out uncertainty and thus manage excess complexity related to software design.

In the context of OSS, however, it is difficult to specify formal coordination mechanisms a priori. One reason is that problem solving often starts as a personally motivated journey to “scratch an itch”, which potentially leads to the formation of a community of developers with shared concerns. In this context it is very difficult to specify a priori how the process and tasks are to be coordinated. Yet it is not impossible, and in later stages of an OSS project it often becomes a necessity. For example, OSS developers need to use technical platforms in specific ways while carrying out their activities (e.g. deploy specific affordances of version control systems such as Github and Sourceforge in specific sequences to record their contributions) to enforce certain design routines. OSS developers also rely on modularization to enable distributed and weakly coupled task coordination. Finally, they exercise dedicated code ownership policies to maintain separation between the spheres of responsibility and influence over segments of code.


Due to the lack of pre-designed and formal coordination mechanisms, past explanations of coordination in OSS have primarily rested on the idea of ‘divisible’ coordination achieved through modularization – a process of dividing complex problems, a priori, into smaller, independent, or weakly connected problems (i.e. self-contained tasks) which can then be dealt with individually (MacCormack, Rusnak, & Baldwin, 2006). To facilitate such forms of coordination, the OSS codebase is assumed to exhibit a modular structure even before work on a specific problem starts. A related principle is that of ‘superposition’ (Howison & Crowston, 2014), a work process enabled by modularization in which layers of code are added incrementally to a codebase as developers work on simple problems which do not cut across module boundaries. This helps to reduce task interdependencies, and makes the overall coordination task lighter. These two complementary approaches (modularization and superposition) both act to minimize coordination costs associated with interdependencies and reduce the difficulty level of the problems attacked individually.

However, these accounts essentially leave out of consideration the possible presence and effect of ‘methodologies’ (e.g. agile, XP, waterfall) – or procedurally based coordination – which are commonly used to coordinate software development. In the context of OSS, such methodologies are not visible (because there is no ‘pre-design’) and there are therefore few well-structured standard operating procedures available to developers. Rather, such procedures emerge from the situated activity-based interactions among developers, over time. Such activity patterns are generally captured by the idea of a routine (Pentland, 2005) – a habitual and enduring pattern of action. I posit that it is through such routines that OSS developers accomplish their work. In this dissertation I explore specifically how such latent routine structures emerge (the words ‘latent’ and ‘emerge’ are meant to indicate that such routine structures are not ostensively defined but rather exist as evolving patterns in distributed activities). Further, I theorize how such structures enable coordination of development activities through serving important information processing functions. By this I mean that patterned ways of conducting software development provide templates for how pieces of information should be collated, interrelated, and processed so as to write functional code that adheres to established community standards. Defined as such, routine structures serve as functions for processing this information. Specifically, I will study the origin, evolution, and variation of such latent routines and how they relate to coordination challenges created by varying levels of task complexity associated with OSS development.

The idea of recurrent and inherent social structuring of information processing is central to any form of coordination. In order for individuals to coordinate with each other there needs to be some commonly agreed upon patterning of relationships (ongoing structuring) as expressed in observed activities. As noted, such structures can be specified in designed, manifest forms (e.g. organizational structure as defined by a formal hierarchy chart) or in latent forms (e.g. informal power relationships). In the context of software development, such structures are often latently embedded in the habitual ways in which tools and technologies are used or in how the software artifact is organized. Because, as previously mentioned, formal coordination mechanisms are often difficult to implement in OSS development, I will apply the idea of emergent social structuring to the study of coordination within this context. Here, however, we run into a number of intellectual problems.


First, social structures in OSS have mostly been conceptualized as networks through which resources related to information processing flow, and consequently such structures have been analyzed using SNA. However, as Howison & Crowston (2014) argue, OSS developers conduct most of their work alone. While social networks can explain the degree to which some knowledge resources are shared among developers and thereby contribute to effective coordination, they cannot explain how development procedures become structured over time and how the actual work gets done. In this regard, traditional software development is coordinated mainly using ‘methodologies’, such as agile and waterfall. These methodologies provide formal coordination mechanisms such as project management techniques. Methodologies and project management techniques are consciously designed ways of structuring software development as it unfolds across time. However, given the distributed and communitarian nature of OSS projects, opportunities for OSS leaders to enforce specific methodologies are rare. Therefore, insofar as we can speak of methodologies in OSS – the structuring of activities over time – these are emergent (and enduring) patterns that arise from myriads of situated activities executed by independent developers dealing with idiosyncratic problems. Hence, structures for coordinating activity in OSS are inherently latent, and we need to invent alternative methods to observe the structural properties of such activity patterns. Perceptual measures of the extent to which developers follow a chosen methodology will not do (if they ever follow one).

Second, OSS communities are made up of volunteers who contribute to developing the software for idiosyncratic and heterogeneous reasons. They enter and exit specific communities at high rates. Therefore, even if one can elicit the latent coordination structures of OSS projects analytically, we still need to understand how such structures are formed and what they actually do. Hence, we are faced with a dual problem of understanding a) the origin of such structures, and b) how such structures support effective coordination by providing specific information processing functions.

Third, OSS is often seen as a monolithic approach to software development – essentially a single way of developing software products. While this may have been true in the early period of OSS, evidence has mounted recently that the structure of OSS processes varies significantly. Moreover, OSS is increasingly becoming ‘hybridized’ into private-collective (von Hippel & von Krogh, 2003) and corporate-sponsored 2.0 forms (Fitzgerald, 2006). For example, corporate-sponsored developers currently make 80% of the contributions to the Linux kernel. Furthermore, such variation is likely to extend not only across multiple projects, but also over time within an individual project. It is well-known from traditional forms of software development that software processes tend to follow a lifecycle of some kind (Lehman, 1980). While there are indications that the same is true for OSS (Rajlich & Bennett, 2000), there is little empirical evidence of what patterns such lifecycles follow (for a similar attempt in a different open innovation context, see Kane, Johnson, & Majchrzak, 2014). Hence, we need to grapple with the question of how variation in OSS routines comes about across time (evolution) and between projects (variation).

These three problems are all connected by their common concern with the coordination of complex software development tasks embedded in varying environments. Providing some tentative, initial solutions to these problems will help scholars and practitioners to better understand the impact of process structures in general, and the emergence of specific routine structures in particular, in coordinating complex design endeavors – especially in contexts that are digitally intensive and depend nearly totally on virtual forms of organizing.

Theoretical Framing

In this section I will formulate a theoretical language that will be used to address the research problems stated above. First, I will introduce the Information Processing View (IPV). This view assumes that developers build routine structures of varying kinds when they face differing environments. Second, I will briefly explore two forms of social structuring in OSS projects – relational and routinized structures. Here I will identify opportunities to contribute to theorizing by expanding our understanding of how routine structuring in OSS is likely to emerge. Once this framework has been established, I will formulate specific research questions that will be tackled by the individual studies of this dissertation.

The Information Processing View

To furnish a theoretical account of why and how OSS development patterns come to vary, I draw upon Galbraith’s (1973, 1974) well-known Information Processing View (IPV). Per this view, organizations constantly face information processing needs (i.e. heterogeneous pieces of information that need to be collated, interrelated, and acted upon) of varying degrees that stem from environmental variation and related pressures. The amount of information to be processed is a function of the task complexity, i.e. the number and range of tasks, their interdependencies, and the resulting information content. To respond to such task complexity, managers can use alternative organizational structuring mechanisms – hierarchies, goals, and programs (March & Simon, 1958) – to create related organizational strategies, such as self-contained tasks (to decrease information processing needs) or lateral relations (to increase capacities for processing information). The two central constructs of the IPV – information processing needs and matching strategies for processing information – can help us understand the specific ways in which OSS developers can respond to software development problems of varying difficulty under differing environmental conditions.

In general, the information processing needs of OSS projects are determined by the current status of the codebase and associated design information expressed in an incoming flow of requirements (Scacchi, 2009). The latter are captured in bug trackers, roadmaps, blog posts, and conference presentations, among others. The status of the codebase (such as its variety and interdependence, cf. Jason, Ramasubbu, Tschang, & Sambamurthy, 2013) also influences the nature of development problems. For example, highly complex codebases tend to generate problems and tasks which are more complex in nature, meaning that the uncertainty related to problem solving efforts tends to shift as a new solution is attempted. Overall, specific characteristics of the ‘problems’ faced – the ways in which the codebase needs to be changed based on the present set of requirements – establish the information processing needs to which the OSS project and its attendant structures must be adapted.

To show how organizations generally respond to information processing needs, Galbraith proposes a number of (information processing) structures, such as lateral relationship structures (e.g. taskforces) and vertical forms (e.g. information systems), both of which increase information processing capacities. Additionally, mechanisms such as “self-contained tasks”, essentially a way of modularizing work processes, help minimize the amount of information that needs to be processed. These strategies entail selecting and building various structures, which are then systematically configured to address faced information processing needs of differing characteristics and magnitudes.

In the context of formal organizing, such structures are intentionally built using managerial decree. In the OSS context, however, this is difficult because of the volunteer nature of participation and the high level of distribution of work. In this dissertation I will propose, based on empirical inquiry into the work practices of OSS developers, that rationalities (Simon, 1978, 1986, 1996) – each denoting a class of approaches as to what is considered an appropriate problem solving approach and its attendant heuristics – emerge through developer interactions. They then serve as cognitive guideposts for the subsequent structuring of development activities. Characteristics of rationalities, such as which parts of the problem-solving process they focus on and how their legitimacy is established, lead to differently shaped routines and thereby also to different information processing structures.

To address development problems that stem from a codebase possessing certain technical characteristics, information processing structures and related needs must be aligned. Here, Ashby’s law of requisite variety (Ashby, 1956) is a central notion that underlies the concept of ‘fit’ or ‘alignment’ associated with the IPV. This implies that an OSS project is likely to be successful when it is capable of building information processing structures that match the faced task complexity posed by the current state of the codebase and incoming requirements (i.e. information processing needs). Such structures must exhibit similar (requisite) variety that matches the degree of variety of the tasks (Venkatraman & Camillus, 1984; Venkatraman, 1989). Accordingly, OSS developers working in different development contexts characterized by different statuses of codebases and incoming requirements flows need to adopt alternative information processing structures that can provide sufficient levels of variety.
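Ashby’s law admits a compact information-theoretic rendering; as an illustration (the entropy notation below is standard for requisite variety, though it is not used elsewhere in this dissertation), the law can be stated as:

    % Entropy form of the law of requisite variety (Ashby, 1956): the residual
    % variety of outcomes O is bounded below by the variety of disturbances D
    % minus the variety of the regulator R.
    H(O) \;\geq\; H(D) - H(R)

Read in IPV terms, H(D) corresponds to the information processing needs generated by the codebase and requirements flow, H(R) to the variety of the project’s information processing structures, and H(O) to the unresolved variation in development outcomes; only by raising H(R) can a project keep H(O) within acceptable bounds.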

Next, an important question must be asked: given the unique way of organizing in OSS, what kind of structures do OSS developers need to build to match faced task complexity? To answer this question I identify two forms of structures that are available to OSS developers: relational and routinized structures.

Social Structure in Organizations¹

¹ This section is loosely based on Lindberg, A., Gaskin, J., Berente, N., Lyytinen, K., and Yoo, Y. 2013. “Computational Approaches for Analyzing Latent Social Structures in Open Source Organizing,” in Proceedings of the Thirty Fourth International Conference on Information Systems, Milan, Italy, pp. 1–19.

Although various notions of ‘structure’ have been fixtures of social theory since the seminal works of Durkheim, Marx, and Weber, it is specifically within the functionalist view of Parsons and Merton where the contemporary usage of the term “social structure” gained its footing (Giddens, 1979). Both Parsons (1960) and Merton (1957) conceived of human activity in a context of nested social systems. According to Parsons, a social system is a ‘collectivity’ of humans interacting with each other in particular roles – and these roles are fundamental to the collectivity (examples of collectivities include organizations and societies). Parsons distinguished between collectivities and institutions: the former comprise the elements of the system, whereas the latter are “generalized patterns of norms which define categories of prescribed, permitted and prohibited behavior in social relationships” (Parsons, 1960: 177). Institutions embody the cultural goals, values, and prescriptions (scripts) for appropriate action (Merton, 1957; Parsons, 1960) and stand as regulations for the “allowable procedures” in the system (Merton, 1957: 133). As such, institutions go hand-in-hand with collectivities and define appropriate sequences of activities for the various roles in the system for situations that the system might face. Thus, taken together, according to a functionalist view, social systems have two elements: 1) the relations between elements of the system – formal roles of people and workflow-based interactions, and 2) the routine elements of the system – defined by goals, values, and institutions associated with proper courses of action in the context of the system. This fundamental distinction between the two structuring aspects of a social system – its relational structure and its rule-based routine structure – is one of the foundational distinctions in contemporary theorizing of organizations.

Since early organizational scholarship, there has been an increasing realization that institutions are not somehow enclosed within particular organizational contexts, but instead transcend organizational activity in both time and space (Giddens, 1979). Institutions are “the enduring features of social life” (Giddens, 1984: 24). Institutional norms, values, scripts, and related typifications exist outside, are prior to, and result from organizational activity while being shared across social systems and fields (DiMaggio & Powell, 1983). Organizational actors therefore draw upon a broader institutional order for the rules and resources that they continually reproduce (Giddens, 1984). Routine structures transcend any one organization and are therefore, although related to relational structures, fundamentally different. Hence, I distinguish between two forms of structure that jointly comprise social structure: the relational structure, which involves the structural elements of a system and their relations to each other; and the routine structure, which involves the rule/habit based patterns of activity in a system.


In addition to recognizing differing forms of structure, it is important to note that each can involve either manifest or latent dimensions. Merton (1957) indicated that systems have ‘manifest’ functions – those that are conscious and deliberate. He also described a dysfunctional subtext within systems that he named ‘latent’ functions – those that are unconscious and unplanned (see Table 1). Thus, organizational structures may be manifest or latent. Manifest structures are made explicit by being planned out and documented, and thereby made available to concerned actors. For relational structures, this could involve the formal organizational structure (Mintzberg, 1979), or interorganizational arrangements such as alliances (Powell, 1990). For routinized structures, the manifest dimension involves explicit written norms and rules expressed in ostensive aspects of organizational routines, such as specifications of business processes (Pentland & Rueter, 1994), or related institutionalized scripts for appropriate action, e.g. preferred ways of hiring faculty (DiMaggio & Powell, 1991). Latent structures, on the other hand, are not readily available to either organizational participants or researchers and represent the informal, organic, and unplanned structures that emerge during action. They are not necessarily dysfunctional, as Merton suggested; in contrast, they often benefit organizations in specific circumstances (Blau & Scott, 1962). For relational social structures, latent forms include the informal interpersonal networks in which organizational actors are embedded (Granovetter, 1985); latent routinized structures involve performative or habit-based behavior – the enactment of ‘actual’ organizational routines (Feldman & Pentland, 2003), rife with improvisation and bricolage, which can all be described in terms of enacted institutional practice (Thornton, Ocasio, & Lounsbury, 2012).


The manifest forms of structure are generally accessible through classic methods of social inquiry since they are typically well represented. Latent structures are challenging since they are not readily accessible and cannot easily be made explicit to any social actor observing the social system. To reveal them we need trace data to find out who interacts with whom and who did what, and when, and then, based on such observations, to determine recurrent patterns. In both tasks we need ample computational resources to aid the researcher in recording traces, identifying patterns, and revealing the enacted latent structure. Before digitalization, such data was extremely difficult and time consuming to gather. Now, with the extensive digitalization of work and interactions, such data has become more abundant (Lazer et al., 2009). Next I will briefly address each form of latent structure and describe how it specifically relates to the information processing tasks inherent in the coordination of OSS development.

Table 1. Four Dimensions of Organizational Social Structure

Relational Structure
- Manifest Relational Structure: arrangements for human interaction. Example units of analysis: organizational chart and inter-organizational forms. Example citations: Mintzberg, 1979; Powell, 1990.
- Latent Relational Structure: regularities of human interaction. Example units of analysis: informal social networks. Example citations: Granovetter, 1985; Burt, 2000.

Routinized Structure
- Manifest Routinized Structure: prescriptions for human action. Example units of analysis: written ostensive organizational routines / institutional scripts. Example citations: Nelson & Winter, 1982; DiMaggio & Powell, 1991.
- Latent Routinized Structure: repeatedly enacted rules and resources for human action. Example units of analysis: performative organizational routines / institutional practices. Example citations: Feldman & Pentland, 2003; Thornton et al., 2012.

Relational Social Structure

In the 1950s and 1960s, a number of organizational scholars noted that analyses of phenomena using categories of groups or classes were seriously confounded by overlaps and interconnections with what at first had seemed to be easily separable groups of individuals (Wellman & Berkowitz, 1988). Certain ‘rebellious’ scholars searched for alternative, empirically-driven perspectives for capturing the latent relations between individuals without the need for theorizing groups ex ante – groups that may not be as relevant as underlying social bonds between individuals (Granovetter, 1990).

The relational view of social structure was born; its conception of interaction relationships as the foundational unit of social structure began to take form with a somewhat alternative set of assumptions compared to the mainstream of social research. Rather than looking at individual actors and their characteristics, this wave of research analyzed relations among the actors as the unit of analysis (Coleman, 1958) and created a decisive break from the sociological and socio-psychological research of its day. It essentially argued that individuals’ behaviors are determined by the context an actor is embedded in and not by individual characteristics. The core concepts associated with relational structure involve thinking of organizations in terms of networks of nodes that have ties of some form with each other (Monge & Contractor, 2003). A node can be any actor (even non-human actors), and a tie can be any type of interaction or association between actors, including forms of communication or social association. Ties can have a variety of characteristics including strength, direction, stability, etc. (Monge & Contractor, 2003). Using this basic theoretical lexicon, relational theorists can identify common patterns that characterize organizational networks, such as power-law degree distributions, transitivity, and embeddedness.

Relational structures are helpful for understanding the information processing activities which are involved in the coordination of task interdependencies, because they explicitly show how individuals work together (or not). Hence, relational networks can be seen as conduits through which information flows, thus enabling individuals to coordinate effectively with each other. As such, relational structures provide us with analytical opportunities to see which kinds of relational patterns are conducive to effective coordination. For example, formal organizations are based on establishing specific hierarchies that facilitate specific flows of information, intended to a) ensure that the right pieces of information get to the right people, and b) reduce the amount of noise (irrelevant information) that individuals receive.

Our current concern, however – software development – is typically conceptualized and organized in terms of ‘methodologies’ – essentially procedural, rather than relational, strategies for processing information, such as agile and waterfall approaches (e.g. Vidgen and Wang 2009). Such patterned sequences of action for processing information are manifested in routines (Jacobides & Winter, 2012; Teece, Pisano, & Shuen, 1997), which are repeated patterns of action distributed across the community (Feldman & Pentland, 2003), geared towards achieving specific organizational goals – such as triaging bugs, inquiring into difficult technical problems, or adding new features. Routines have also been argued to embody “best practices” (Eisenhardt & Martin, 2000) or organizational memory (Nelson & Winter, 1982). Therefore, routines can help to explain the superior or mediocre performance of specific organized forms of OSS. In this context, routines execute the information processing work inherent in software development – i.e. solving technology-related design and implementation problems, such as fixing bugs or soliciting and implementing user-requested features. Hence, routines provide a rigorous conceptualization of temporally unfolding social structures, a conceptualization which I will explore next.

Routinized Social Structure

In organizational contexts, latent social structure is encoded not only in networks, but also in routines that are continually reenacted (Nelson & Winter, 1982). Such routines are on the one hand stable and persistent, but on the other hand they are also a source of flexibility and change (Dionysiou & Tsoukas, 2013). Routines are the locus for a wealth of an organization’s knowledge; within routines – and the meta-routines through which they change – rest also the dynamic capabilities that enable organizations to adapt to varying circumstances (Eisenhardt & Martin, 2000). One can also distinguish between the ostensive (manifest) and performative (latent) dimensions of organizational routines (Feldman & Pentland, 2003).

Hence, routinized structures provide developers with templates that guide the structuring of action. Thus, rather than providing conduits for the transmission of information, routinized structures provide guidance to developers for how to process information. For example, a generic social more within contemporary OSS projects may specify that in order to change the codebase of a project you must first report a bug, then provide code to squash (i.e. solve) the bug, after which this code is discussed and revised to fit contextually determined coding standards, leading up to a decision of whether to accept or reject this particular piece of code. Such standards provide guidance as to what actions are appropriate given a particular history of activity. Stringing together actions in this particular manner then achieves the function of collating, integrating, processing, and acting upon relevant pieces of information.
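As a concrete (and purely hypothetical) illustration of how such a template can be read off digital trace data, the following R sketch recodes a platform event stream into coarse activity types. The event names and activity codes below are invented for illustration and are not the coding scheme used in this dissertation:

    # Hypothetical mapping from platform events to coarse activity types.
    event_to_activity <- c(opened    = "report",
                           commented = "discuss",
                           committed = "submit_code",
                           merged    = "accept",
                           closed    = "close")

    # One issue's event stream, ordered in time.
    events <- c("opened", "commented", "committed", "commented", "merged", "closed")

    # The routine performance, expressed as an ordered activity sequence.
    paste(event_to_activity[events], collapse = " -> ")
    #> "report -> discuss -> submit_code -> discuss -> accept -> close"

It is ordered event streams of this kind – rather than the network of who talks to whom – that carry the routinized structure of interest here.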


A particular tension in routines that has been explored extensively is that between variation and stability (Feldman, 2003). Routines seem to be a source of stability (their traditional goal), but are also capable of generating significant amounts of contextual adaptation and improvisation (Feldman & Pentland, 2003). I note three key elements in studying such patterns: 1) ordering, 2) rhythm, and 3) variety (Table 2).

Table 2. Key Aspects of Routinized Structure in Open Source Organizing

Ordering
- Definition/description: specific ordering of events has contextual significance.
- Original citations: Abbott & Hrycak, 1990.
- Findings in open source organizing: feature requests and bugs tend to initiate action sequences.
- OSS citations: von Krogh et al., 2003; Christley & Madey, 2007; Crowston & Scozzi, 2004.

Rhythm
- Definition/description: grooved and entrained sequences of events.
- Original citations: Ancona & Chong, 1992; Gersick & Hackman, 1990.
- Findings in open source organizing: growth in communities, contributions, and codebases are entrained to each other; growth is either constant or punctuated.
- OSS citations: Capiluppi, 2004; Capiluppi et al., 2005; Robles et al., 2005; Koch, 2005; Feller & Fitzgerald, 2000.

Variety
- Definition/description: variation in event types.
- Original citations: Campbell, 1988; Wood, 1986.
- Findings in open source organizing: variety of activities makes up the ‘lifeblood’ of OSS practices.
- OSS citations: Lindberg, 2013; Lindberg & Berente, 2014; Monteiro & Østerlie, 2004.

Ordering: Some scholars have explored particular ways in which activities become staggered across time – their ordering. For example, von Krogh et al. (2003) suggest that different projects have different “joining scripts” that specify a typical ordering of activity suitable for entering a certain development community. Such joining scripts typically start with subscribing to a mailing list to get access to project communications, after which suggestions of bug fixes or new features supported by actual code are made. When such a script was followed, joining developers were more successful in being granted access to a developer community. Hence, institutions both common to the OSS community at large and local to specific communities tend to favor a certain ordering of activity as the preferred way of conducting software development. Christley & Madey (2007) find that distinct activity sequences seem to be initiated by forum postings, bug reports, or feature requests. Such a sequencing of micro-procedures provides us with a view of the generic structuring of procedures over time and thus portrays organizing not just as configuring relationships, but also as configuring activity elements staggered in time. It also mirrors the observations made by Raymond (2001) that the initiative towards continued development comes from the community periphery in the form of bug reports, feature requests, and suggested fixes. Crowston and Scozzi (2004) examined bug fixing procedures and suggested that while bug fixing lacks a formal coordination process, it still consists of a typical sequence of activities: a bug is submitted, then analyzed, fixed, and eventually closed once it is no longer a concern. While the ordering of such routine performances is facilitated by technical functionalities in a version control system or a bug tracker, the ordering of activities also shows how certain rationalities – ways of approaching design tasks – express themselves in the routine structures of OSS projects.

Rhythm: Another important concern has been to establish the overall rhythm through which a project grows. In studies of OSS, this has often been expressed through attempts at establishing typical growth rates of various projects. It has been argued that the size of the codebase and the number of developers grow at similar rates (Capiluppi, 2004), and that most OSS projects grow at a linear rate, although there are exceptions: the Linux kernel has grown at a super-linear rate (Robles et al., 2005), and others have been found to grow at a quadratic rate (Koch, 2005).

The massively parallel nature of OSS development (Feller & Fitzgerald, 2000) means that while activity levels at different times and across different components of a project may vary, they are likely to be associated with each other through processes of entrainment (Capiluppi, 2004). Such grooving of activities creates periods of distinct activity levels and types, often leading to punctuated equilibria – distinct periods of time where a project moves at a uniform rate. Hence, understanding the structure of procedures as it is expressed in somewhat stable patterns of recurring activities helps us to understand the characteristics of activities in a certain period or project.

Variety: The types of activity that unfold in OSS development show substantial variation – meaning that the types of activities and the ways in which they are ordered are not necessarily uniform. Such variety has been used in other contexts to explore the impact on performance in design processes, as well as relationships to digitalization in design processes. In OSS, variety of activity types can be argued to represent an important driver of the writing of functioning code (Lindberg, 2013; Lindberg & Berente, 2014). The variety of activities makes up the “life blood” (Monteiro & Østerlie, 2004) of any OSS community. OSS activities unfold to fulfill specific organizational, ritual, and technical goals. Through these activity patterns, the community is maintained as the emergent codebase grows while existing code is pruned and refined.

In conclusion, various properties of routine structures such as their ordering, rhythm, and variety are key to understanding the specific roles that such structures play in the dynamic unfolding of OSS projects. Through understanding how specifically configured routine structures emerge, evolve, and come to vary systematically with various factors, we can begin to explore exactly how developers build the structures which are able to process the information inherent in the challenges that they are facing.

Research Questions

Given that I have formulated above a theoretical language to identify and analyze the information processing inherent in development patterns, or routine structures, I will now start to inquire into the role of routine structures in the context of OSS development. The specific questions that I ask are intended to probe the origin, evolution, and variation of such structures. These are basic forms of questions in the social sciences with regards to any newly conceptualized phenomenon or entity, and address basic concerns such as why and how a particular thing comes into existence in the first place, how it changes over time, and how it may vary across different instances or in different contexts. Additionally, beyond superficial understanding of certain isolated aspects of OSS development processes such as community joining (e.g. von Krogh et al., 2003) or generic patterns such as growth rates (e.g. Qureshi & Fang, 2010), we know little about how routine structures emerge, evolve, and vary across OSS projects. Hence, I ask:

a) What is the origin of routine structures and what functions do they serve in OSS development?

b) How do routine structures change and evolve over time in OSS development?

c) How do routine structures vary across OSS projects?

To answer these questions, I will next craft a research design and examine the nature of the data that this dissertation will be leveraging, as well as the various approaches to analyzing this data.


Research Design

Traditionally, studies that attempt to identify patterns of behavior (such as routines) in terms of their origin, evolution, and variance seek to identify underlying generative mechanisms that drive processes forward. Typically they explain observed outcomes through necessary or sufficient conditions (Ragin, 1987). Such patterns of behavior and their underlying generative mechanisms have traditionally been captured by ‘zooming in’ using qualitative methods armed with inductive, interpretative reasoning. Such analyses help reveal how specific situational and micro-level ‘forces’ among individuals and their interactions generate outcomes (Langley, 1999). Recently, advanced computational alternatives have emerged which help scholars ‘zoom out’ from specific situations and analyze broader patterns of such contextually situated micro-behaviors. Scholars such as Abbott (1992) have proposed a “narrative positivism” for analyzing sequences of events as patterns. Other scholars, such as Ragin (1987), have proposed more scalable forms of identifying necessary and sufficient conditions. Both use computational methods, adopted either from DNA sequencing or from Boolean algebra.

Overall, both approaches focus on capturing and analyzing larger patterns of behaviors or conditions so as to observe specific combinations of states under which certain outcomes are likely to emerge. Thereby they offer ways to capture underlying regularities in behavior and events. This dissertation is interested in the inherent variation in observed structures of work activity as recorded on OSS development platforms. Therefore it will use sequence-analytic techniques developed for the social sciences by a number of sociologists (Abbott, 1990, 1992; Gabadinho, Ritschard, & Studer, 2011), based on approaches originally deployed in genetic studies within computational biology. By using sequence-analytic approaches I can elicit latent behavioral patterns within OSS work occurring over time. To conduct such analyses I can use samples of digital trace data (e.g. Anjewierden & Efimova, 2006) to detect common behavioral patterns over extended periods of time. These techniques allow scholars to identify primitive patterns (behaviors or events) that are repeated in the data set, and in what forms of regular patterns, if any, these primitives recur. Therefore, I can start with an observed pattern of activities, analyze their ordering, and eventually theorize the underlying generative mechanisms that give rise to these patterns.

A basic premise of this dissertation is that understanding such patterned practices, i.e. the latent structures of routine performances, requires us to understand both global patterns and the local activities from which those patterns emerge. This premise, in combination with the unique forms of data that the OSS context offers, means that the form of inquiry to be conducted leans towards a mixed methods design. Such a design “uses quantitative and qualitative research methods, either concurrently (i.e., independent of each other) or sequentially (e.g., findings from one approach inform the other), to understand a phenomenon of interest” (Venkatesh et al., 2013: 3). This usually implies specific combinations of quantitative and qualitative research, thus allowing for theoretical benefits such as triangulation, integration, and increased richness of evidence (Johnson, Onwuegbuzie, & Turner, 2007; Onwuegbuzie & Collins, 2007).

With the emergence of digital trace data and computational methods of inquiry (i.e. machine learning algorithms based on inductive assumptions), new opportunities for mixed methods studies are opened up. Based on the shared inductive assumptions (Holland, Holyoak, Nisbett, & Thagard, 1989) of computational analyses and qualitative inquiry, flexible and versatile integration of quantitative and qualitative research methods can be achieved. In this dissertation I combine these approaches to fashion a set of exploratory and embedded mixed methods studies. By the term ‘embedded’ I indicate the constant triangulation between qualitative and quantitative evidence within each study.

The main purpose of quantitative evidence in this approach is to provide rigorous, high-level descriptions and facts of macro-level entities such as “social structure”, whereas qualitative evidence is mainly used to provide a rich account of local, meaningful, idiosyncratic action. Through theoretically and empirically embedding meaningful local action within larger structural patterns of social behavior, I forge a tighter linkage between two persistent features of the social world – structure and agency (Giddens, 1984).

Increasing computational power and storage capacity have recently enabled not only the rise of computational social science as a set of techniques for analyzing data, but also created datasets with characteristics that differ from traditional psychometric or econometric data collected through surveys or by tracing social system indicators such as stock market prices or yearly earnings of firms. These new sources of data, and the complementary computational power (Watts, 2007) to analyze them, have recently been heralded as novel conditions which will “revolutionize our understanding of collective human behavior” (Watts, 2007: 489). While the information afforded by the data is limited (i.e. it only pertains to a restricted scope of human activities), the data itself is often highly fine-grained, has a very large population size, often overwhelms traditional analysis tools (hence the moniker “big data”), and is often longitudinal due to being recorded in real time. These characteristics pose new challenges and open up new opportunities.

Social phenomena are often emergent, meaning that structures are generated by recursive interactions between local agents. Such dynamics unfold across time, and are interconnected with multiple overlapping social systems, each exhibiting its own dynamics. This overwhelming complexity has often led to disappointing results of limited scope in the social sciences. The emergence of digital trace data capturing human interaction in real time, and the complementary computational capacities to model such data with high fidelity, offers new avenues for social science research to pursue. Such avenues can potentially overcome some of the past difficulties in understanding the relationships between local agency and persistent social structures.

The research design followed in this dissertation seeks to seize some of these new opportunities. It employs computational methods, based on inductive identification of patterns, and qualitative methods, based on interpretation of texts. While the types of analyses in each study are different, the common denominator is the unit of analysis – routine structures – and their connection to varying forms of information processing. The studies pursue the triangulation, complementarity, and developmental goals of mixed method studies (Venkatesh et al., 2013). The overall analysis process (depicted in Figure 1) starts with processing digital trace data using sequence analysis, so that computational techniques such as cluster analysis can be employed (to identify quantitative patterns).

This serves as an approach to answer questions with regard to the patterned nature of activities – effectively eliciting the contours of the social structures that are active in the particular OSS projects I studied. Subsequently, qualitative inquiry is conducted into the overall context to understand the local generative mechanisms behind observed routine patterns (e.g. contextual conditions and drivers). This facilitates inquiry into the generic cognitive and normative patterns that characterize a specific OSS project. Additionally, related performances of routines within each cluster are examined using qualitative interpretative analysis to ascertain their substantive meaning and purpose (as expressed in text).

Figure 1. Research Process

Once all of these elements are in place I zoom in and out across quantitative patterns, context, and text to achieve conceptual congruity across global patterns and local action (Gaskin, Berente, Lyytinen, & Yoo, 2014). Based on this I seek to theorize about, and validate, a specific set of generative mechanisms which abductively (Gregor, 2009; Locke, Golden-Biddle, & Feldman, 2008; Vaast & Walsham, 2011; Zachariadis, Scott, & Barrett, 2013) seem likely to produce the quantitative patterns that have been observed.

In the sections that follow I will explain the research design in detail. First I will provide an overview of the unique features and implications of digital trace data in general. Then I will explain how the specific data used for this dissertation was sampled and collected. Subsequently I will explain how both computational and qualitative methods were used to analyze the data, and last I will explain how the inferences made were validated.

Characteristics of Digital Trace Data

In order to conduct the mixed method analyses outlined in Figure 1, a special dataset is needed. I need data that can be represented as a) structured, formalized patterns (e.g. enabling measures of event types, timing, sequencing, etc.) and b) qualitative information (i.e. texts, interviews, and contextual data) that provides the meaning of the situations captured by the structured quantitative patterns. Typically, such process data is walled off in corporate gardens, and is therefore generally unavailable. However, the transparent nature of the OSS context offers me unique access to data that satisfies both criteria. Given that such data is available, we need to grapple with its unique nature. Therefore, in this section I will lay out the problems related to using such unique forms of data.

If we can collect OSS development data that is fine-grained, shows relationships, has temporal duration, and specifies the nature of events and activities, we can lay the groundwork for understanding the social interactions taking place during OSS development both at fine levels of granularity and with the vastness of scope that makes zooming in and zooming out possible. Most importantly, such data helps us understand social behaviors in a manner that departs from the exclusive focus on linear covariate relationships that dominates social inquiry. This is achieved by integrating multiple forms of data in an explicit inquiry into dynamics and temporal patterning.

However, utilizing automatically generated, computationally collected data comes with two major problems: incompleteness and fragmentation (Bird, Rigby, & Barr, 2009).


First, the data is often incomplete, for a number of reasons: changing hosting platforms and/or repositories for the same project, breakdowns in data storage (lost data), as well as changing database schemas that only allow certain data to be captured for certain time periods. Second, the data is often fragmented – meaning that one cannot get a complete picture of all the activities of a project using this type of data. While version control platforms such as Github (which is used for this dissertation) offer a wealth of development and collaboration affordances for their users, they are not the only tools or forums used for development and collaboration. In practice, OSS developers also use email, Twitter, IRC, Skype, face-to-face meetings, as well as other tools for coding, project management, and communication. Github, however, only captures the outputs of this process: the code and thoughts that developers want to contribute to the community at large.

Together, the incompleteness and fragmentation of available datasets warrant a certain level of skepticism and care with regard to the conclusions that can be drawn from such data. Therefore, it must be remembered that I cannot represent all the interactions in an OSS project, only a number of key interactions that, fortunately, are closely associated with the activity of producing the actual code. We can therefore imagine that there are other processes that are antecedent to the activities that I am considering. However, the fact that my units of measurement might have unobservable antecedents does not obscure the linkages that I am trying to uncover between what I can measure in terms of code production (e.g. pull requests, commits, and discussion comments), the structures deduced from such data, and their potential associations with theoretical constructs such as information processing structures.


The digital trace component of the dataset consists of time-stamped data on interactions between developers (and artifacts). However, since the dataset contains detailed traces of the behaviors of individual developers in relation to the full set of artifacts involved, as well as fellow co-developers, there are many different ways of parsing the dataset. My parsing is based on the conceptual framework that captures the routine structures of coding practices. To complement the thin, but wide, data afforded by digital traces, I also collected interview data to allow for a thicker, more contextualized understanding of activities. Due to the public nature of OSS, there are also numerous blog posts, mailing lists, public interviews, presentations, and workshops stored in text, video, or audio format. These have also served as important sources of qualitative description of the context of each project.

Below, I will explain the dual nature (qualitative and quantitative) of digital trace data and discuss the epistemological implications of such data – what we can and cannot learn from this type of data. Then I will discuss the process through which these digital traces are created in my case, by exemplifying the process of creating digital traces on the Github VCS platform.

The Dual Nature of Digital Trace Data

Because routine structures have measurable quantities associated with them (e.g. the overall heterogeneity of routine structures), they are amenable to quantitative analysis. However, the same structures are often (but not always) also amenable to grounded qualitative inquiry. For example, on OSS hosting platforms such as Github, the same data that can be extracted to elicit routine structures (i.e. coherent streams of actions) can also be read as text, simply by looking at the flow of dialogue and code recorded in the course of reporting bugs and writing code. Further, such digital traces are embedded in a context containing cognitive and normative patterning, which can be analyzed by examining qualitative data in the form of interviews and other publicly available narrative documents (see Figure 2).

Figure 2. The Dual Nature of Digital Trace Data

Therefore, the ‘thin’ (Geertz, 2005) characteristics of quantitatively analyzable social structure can be complemented with the ‘thick’ characteristics out of which we can induce meaning, cognition, discourse, as well as normative and power-related concerns. In this sense, routine structures can be analyzed both in terms of their nomothetic and idiographic characteristics (Windelband, 1904). They are nomothetic because they are generalizable (Gaskin et al., 2014), but they are also idiographic because they differ in the local meanings and interpretations that focal actors attach to them.

This suggests that the use of qualitative methods to support computational inquiry will be helpful both for zooming in to the specifics of certain structures and for zooming out to see the larger, more generalizable features of said structures.


Datasets with these characteristics can, in my case, be generated from traces of users’ interactions with technical platforms, such as Version Control Systems (VCS), workflow management tools, scientific computing clusters, or hedonic systems (e.g. online multiplayer games). As users execute actions afforded by such technical platforms, traces of these actions are recorded on the platform. In order to assess the validity of such data, and in turn properly bound the conclusions that can be drawn, we need to understand the characteristics of the socio-technical process that generates such data, and therefore analyze both the design and meaning of the affordances related to these platforms and how they are enacted in specific situations.

Epistemological Characteristics of Digital Trace Data

While the information generated by digital traces exhibits fine granularity, it is also ‘thin’ (Folger & Turillo, 1999). The thin nature of the data means that it is purely behavioral and void of references to intentions or psychological states. Therefore, I will augment my construct development with an inquiry into ‘thick’ qualitative data, to ensure that interpretations of digital traces are backed up by emic accounts (i.e. from the perspective of the study participants) of development practices, which convey the lived experience of participating in developer communities. The strengths of digital trace data are its massive size (both in terms of population and temporal duration) and its lack of respondent bias due to memory effects or social desirability. While the massive size is definitely a benefit, it also creates a number of computational challenges.

Digital trace data does not conform to the traditional assumptions of either psychometric or econometric data. Rather, what we have is detailed behavioral data for a large number of individuals over long periods of time. The data is constrained by the functionality provided by platform designers and by the parts of the data that are recorded and then selected for analysis by researchers. However, users do not bias the data, since their actual behavior is recorded by the platform. The only ways for users to ‘bias’ the data are to not interact with the platform or to ‘game’ the platform, since all activities conducted on the platform are faithfully recorded, even if such traces can sometimes be devoid of intention and other underlying psychological processes.

To use the data to its full effect, we need to understand the epistemological implications of the behavioral interactions recorded. In order to do this I will draw on Dewey's pragmatism, which posits action as a primary ontological unit (Dewey, 1938). For Dewey, the veracity of a statement is not judged by some imagined correspondence with ‘reality’; rather, veracity is equivalent to efficacy. Therefore, action and inquiry overlap to a significant degree. Human beings interface with the world not through disembodied cognition, but through the contextually embedded performance of actions. The way that problems are discovered, solutions negotiated, and progress occurs in an OSS project is therefore through action. In summary, pragmatism provides a suitable philosophical foundation for dealing with the kind of data at hand.

Further, social theory on structure emphasizes the relationship between social structures and agency – action that makes a difference (Emirbayer & Mische, 1998). Therefore, while the data itself is purely behavioral – traces of individual actions – the patterns that can be elicited by analyzing large amounts of such data can help us understand various social structures. These social structures lie dormant in the patterning of large numbers of activities.


Hence, when analyzing digital trace data we are limited to observing the actions of OSS developers – essentially etic (i.e. from the perspective of the researcher) accounts of interaction. Therefore, my analysis needs to be supported by qualitative inquiry that can access the emic accounts of developers participating in coding practices and communities. Through such inquiry I can ensure that the ways in which I interpret digital traces are grounded in the experiences of developers, and therefore related to the latent structures of development practices and developer communities.

The type of research that emerges from these epistemological implications and their potential is what Pollock & Williams (2008) call third-wave studies – research that examines both rich contexts and abstracted structures. Through rich accounts of local processes executed by developers, we can understand the contextualized practice of information processing in OSS. Simultaneously, complete digital trace data allows us to use quantitative techniques to elicit the latent patterns that make up the software development procedures of each project. Hence, we can observe the global structures in the data. Through theorizing we can thus connect ‘thick’, localized accounts of agency with ‘thin’, abstracted accounts of social structures.

As we have seen above, this detailed data offers opportunities for discovering social structures of various kinds. To do so, the data often needs to be transformed extensively, and novel analytical techniques subsequently need to be applied to elicit latent structures. What trace data lacks in terms of fine-grained psychometrics, it compensates for by providing rich details on behavior. This situation demands more of the analytical techniques than high-quality psychometric or econometric data would. Such traditional data can often be analyzed using simple regressions to illustrate various causal relationships. However, in order to identify constructs and their relationships from data that essentially consists of activity logs, one needs to apply analytical techniques that elicit latent constructs representing the various patterns that lie dormant in the data. Hence, the mode of analysis that this type of data calls for is mainly inductive (Holland et al., 1989) and rests on computational techniques that allow us to understand routine structures in terms of structural variation.

Digital Trace Data at Github

The digital traces used in this study are generated and analyzed by three main groups of actors: platform designers, users, and researchers. Below I will explain how each of these groups takes part in the process of generating and analyzing digital traces (see Figure 3 below). After this generic overview, I will delve into the details of how digital traces are treated on the Github platform.

Figure 3. The Process of Generating & Analyzing Digital Traces

Platform designers specify both the general idealized affordances – the actions that the platform makes possible (Gibson, 1977) – and what information gets recorded when those actions are performed. The information that is stored is limited to what is supplied by users, and to what information is created through users’ interactions when they enact platform affordances. Hence, the ways in which platform designers have set up affordances, and mechanisms for recording information based on the activation of those affordances, set the boundaries for what types of data will be created and can be used by researchers to understand various forms of behavior and structures. For example, on the Github platform there are extensive affordances for editing and sharing code and related conversations.

Users are actors who engage with the platform because its affordances are useful for achieving their goals. The particular goals, and the ways in which users try to achieve them by activating certain sets of affordances in certain sequences (and also in relation to other users, given the increasingly social nature of these platforms), shape the particular digital traces left by each user. The affordances utilized by individual users on the platform thus leave digital traces that tell the story of that particular user’s activity in relation to the platform. For example, when OSS developers interact with a project on Github, they leave specific traces based on the particular actions that they undertake. A developer making an edit to a shared file will, for example, leave a trace specifying who edited what file, at what time, and in what manner.

Researchers, trying to achieve certain scientific goals, choose parts of this dataset to be extracted for analysis, based on notions derived from previous theorizing and qualitative inquiry into the meaning of certain digital traces. Based on the particular sections of the data extracted for analysis, researchers attempt to infer behavioral patterns. Such inquiry is obviously bounded by the affordances specified by platform designers, as well as by the activities performed by platform users. For example, in this dissertation I extract digital traces related to the software development actions that OSS developers undertake.

This process essentially converts activities utilizing affordances into digital traces, which in turn are converted into empirical data to be used in research. As users exercise the affordances specified by platform designers, specific digital traces are left. Subsets of these digital traces can then be extracted by researchers to be used as empirical material in scientific inquiry (see Table 3 below).

Table 3. Aspects of Digital Traces at Github

Affordances – Specific action possibilities which platform designers provide to users. Example: Github affords the submission of bug reports.
Digital Traces – When users exercise affordances, specific digital traces are left. Example: all comments related to code are recorded in discussion threads.
Empirical Data – When researchers select specific subsets of digital traces, these can be used as empirical material for research. Example: specific details on who executed what action, at what time, in relation to which project, can be extracted.

Having considered how digital traces are generated on digital platforms in general, I will now consider how digital traces are generated on the Github platform specifically. This platform has been designed and developed by designers intending to create a service that helps the OSS community manage code, projects, and multi-contributor collaboration. Built on the git VCS, which provides basic affordances for storing and managing multiple versions of code, Github has attempted to add a layer of social interaction affordances on top of git (Bird et al., 2009). Therefore Github facilitates not only the versioning of code, but also the other parts of communication and collaboration necessary for a large community to work together on a common project.


The users on Github are mostly OSS developers who seek to develop and market their own projects, as well as contribute to the projects of others. These activities are motivated by fun (Luthiger, 2005), the desire to learn, and intentions of building a network and a positive reputation (Weber, 2005). In order to fulfill these goals, users activate affordances for code versioning as well as for communication and collaboration. Through these activities, digital traces are generated that detail the timestamped actions of a developer, as well as how these actions relate to artifacts and other users.

On Github, activity data is organized by pull request. Pull requests are packages of suggested code changes in the form of multiple ‘commits’ (i.e. discrete code changes) organized into a coherent unit – code that a developer has merged into his or her own local copy of the codebase, and now wants the project owner to ‘pull’ into the common, community copy of the codebase. That means that each activity sequence starts with a pull request, to which one or several distinct pieces of code are attached, which are then potentially discussed and renegotiated before a decision is made to merge the code into the codebase or to reject the pull request outright. Hence, my data represents distinct routine performances related to distinct problems. These may be conceptually connected into larger problem-solving routines, but in the way that Github structures the data, each pull request is distinct and separate. The types of activities that are logged are specified by the Github platform itself, and are thus emically defined (see Table 4 below); they include activities such as opening and closing pull requests, merging written code into the overall codebase, and discussing the code in focal or related pull requests.


Table 4. Definitions of activity types

assigned – A problem is assigned to a specific developer
closed – A pull request is closed
discussed – A discussion comment is made
mentioned – A specific commit is mentioned in a discussion comment
merged – A pull request is merged into the baseline copy of the code
opened – A pull request is opened/initiated
referenced – A pull request is referenced in another pull request
reopened – A closed pull request is reopened
reviewed – A specific snippet of code is commented upon

Researchers, like myself, can then extract these streams of interaction data in order to learn more about how activities unfold across time, and how they implicate networks of actors and artifacts in the process of designing and developing OSS projects. Here, I have made a conscious choice to focus on specific aspects of the work that occurs on Github. Previous research (Howison & Crowston, 2014) has shown that a large fraction of code contributions are made by developers working alone. In this dissertation, the intention has been to focus on work of a more collaborative nature. This is captured in the pull requests that developers submit on the Github platform. Developers often work through pull requests when they intend to get in touch with other developers to communicate about bugs or suggested code changes. Hence, my specific slicing of the digital trace data captures less well-studied aspects of OSS, which may give new insights into how developers collaborate using routines.

Sampling & Data Collection

The version control platform Github was used as the empirical setting for the studies in this dissertation. This setting is relevant for a number of reasons. First, Github is currently one of the premier places to host, develop, and promote OSS projects (e.g. http://www.wired.com/wiredenterprise/2012/02/github/). Second, the largest projects hosted on Github represent an important (yet understudied) mid-section within the larger population of OSS projects. This is so because the largest OSS projects (e.g. the Linux kernel and the Apache Foundation) usually have their own individually tailored hosting and version control solutions. Therefore, the largest projects on a platform such as Github or Sourceforge are somewhat representative of what could be labeled an OSS mainstream – large projects that attract communities of developers, but which are still not large enough to become self-sufficient, and thus still rely on hosting services such as Github. Third, using Github as a context enables me to collect large amounts of data that is automatically assembled using principles and techniques that are largely the same across each project – thereby ensuring data standardization.

To capture routine patterns existing latently in digital trace data, I theoretically sampled (Eisenhardt, 1989) four different OSS projects, all hosted on the Github OSS platform (https://github.com/): Rails, Django, Rubinius, and Bootstrap. The sampled data covers a time period of roughly 12 months between January 2012 and January 2013.² These projects were chosen because they are similar across multiple dimensions, thus allowing us to control for their influence (Yin, 2008): 1) they are all related to web development; 2) they are well-known and have enjoyed similar degrees of success (English & Schweik, 2007; Wiggins & Crowston, 2010); 3) they are all of medium size (medium indicating that they are among the largest projects on Github, but still smaller than the giants of the OSS world – Linux, Apache, Mozilla, etc.); 4) they have substantive communities of developers involved; 5) they are mature by virtue of having multiple official releases; and 6) they enjoy some form of corporate involvement.

² Study #2, with its focus on evolution over time, utilizes a dataset covering 28 months, for the Rails project only.

Django (https://www.djangoproject.com/) and Rails (http://rubyonrails.org/) are full-stack web development frameworks, the former based on the Python programming language and the latter on Ruby. Rubinius (http://rubini.us/) is a virtual machine for the Ruby programming language, mostly written in Ruby, and is therefore a flagship project within the Ruby community. Last, Bootstrap (http://getbootstrap.com/) is the most popular project on Github (at the time of data collection it had 38,448 ‘watchers’ – i.e. developers who were paying attention to the project using Github’s affordances – the most of all Github projects), and it is a frontend web development framework originally developed by employees at Twitter.

Because of the online, distributed nature of the projects, I was able to collect rich digital trace data (captured in Table 5 below). The vast majority of work took place on the Github version control system. In order to extract the quantitative trace data on routine structures, several scripts were written based on the Github data mining toolkit by Gousios & Spinellis (2012). These scripts capture every single activity in each project for the period under scrutiny. Hence, such data gives us a complete picture of the digital traces left by developers during the process of writing code.
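To make the extraction step concrete, the sketch below shows how a per-pull-request activity sequence of the kind analyzed here could be assembled. It is a minimal illustration, not the actual scripts used (those were built on the Gousios & Spinellis toolkit); it assumes a hypothetical access token and relies on Github's public issue-events endpoint, through which pull request events such as those listed in Table 4 are exposed.

```python
# Illustrative sketch only; the dissertation's extraction scripts were based
# on the Github data mining toolkit by Gousios & Spinellis (2012). On Github,
# pull requests are a kind of issue, so their event history (assigned, closed,
# merged, referenced, reopened, etc.) is available via the issue-events API.
import requests

TOKEN = "..."  # hypothetical personal access token
HEADERS = {"Authorization": f"token {TOKEN}"}

def pull_request_sequence(owner, repo, number):
    """Return the time-ordered list of event types for one pull request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{number}/events"
    events = requests.get(url, headers=HEADERS, timeout=30).json()
    events.sort(key=lambda e: e["created_at"])  # chronological order
    # Discussion comments ('discussed') live in a separate endpoint
    # (.../issues/{number}/comments) and would be interleaved by timestamp.
    return ["opened"] + [e["event"] for e in events]

# e.g. pull_request_sequence("rails", "rails", 4321) might yield something
# like ['opened', 'referenced', 'merged', 'closed'] (hypothetical output)
```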


Table 5. Sample Sizes

Measure            Django   Rubinius   Bootstrap   Rails    Total
N (sequences)*     621      279        1,440       3,284    5,624
N (activities)*    3,004    2,620      4,420       39,850   49,894
N (interviews)     12       17         4           13       46
N (public audio)   1        3          1           11       16
N (public video)   17       3          3           10       33
N (public text)    9        8          18          10       45

* Sequences are activities strung together by a common pull request/issue identification number (i.e. they all relate to the same software patch or bug report). These are also available as human-readable text, including discussion comments and code.

Interviews were conducted over Skype, as well as through attendance at industry conferences (e.g. DjangoCon). These interviews were largely semi-structured and guided by a template that was built iteratively as I gathered more experience with the participants. The final version of the interview protocol can be found in Appendix E. The overall goal of the interviews was for participants to describe their coding and collaboration practices as they unfold over time, as well as to provide insight into the contextual dynamics of each project that govern decisions such as which features to implement, what good coding style is, etc. Typical questions were of the kind “Please describe a recent contribution which you made to the project. What did you think and do during the process?” and “Please describe a situation where decisions with regards to features were made within the project. What happened?” Based on this template, each interview was conducted in a slightly different manner, taking into account the form of each interviewee’s actual experience. Interviews ranged in duration from 45 minutes to 3 hours and were conducted with founders, corporate sponsors, core team members, regular contributors, as well as peripheral contributors of each project. Due to the differing levels of social complexity, saturation was reached at different rates. For example, Bootstrap has no core team per se, and the two project owners make all decisions. Therefore, after interviewing one of these project owners, as well as three peripheral members, emergent categories were already stable. In more complex projects, such as Rubinius, where social structures are more differentiated, interviewing founders, core team members, ‘evangelists’, and other contributors of varying degrees of involvement required substantially larger numbers of interviews to achieve saturation.

Analyses

Once this rich and multi-faceted dataset, consisting of both quantitative and qualitative data, had been collected, I set out to analyze it using an array of methods. The main form of computational analysis used is sequence analysis, which allows for the elicitation of routine structures and their quantitative properties. To explain the contextual meaning as well as the local significance of such global patterns, I also conduct qualitative inquiry based on a grounded theory approach. To achieve congruity across inferences with regard to both global patterns and local action, I “zoom in and out” across multiple sources of evidence, and during this process I craft theoretical accounts of the generative mechanisms behind observed quantitative patterns. As such, this is a process where qualitative and computational inquiry go hand in hand, and throughout the analysis process insights from analyzing one form of data cross-pollinated the analysis of the other. First I will explain the main forms of computational analysis employed – sequence and cluster analysis – and then I will explain the grounded qualitative inquiry into both routines-as-text and contextual sources of data.


Computational Analysis

With the availability of a unique dataset rich in qualitative content as well as complex quantitative data, a tailored set of analyses is necessary. As previously outlined, these analyses consist of a tight integration of computational analyses and qualitative inquiry. Since there is a wide range of computational analysis techniques, which are often not well understood within mainstream IS and organizational research, I will first provide an overview of the computational techniques that will be used (sequence and cluster analyses). In the next section I will show how the qualitative inquiry was conducted and how it integrates with the computational analyses.

Sequence analysis is heavily computational (Gabadinho, Ritschard, & Studer, 2011) and uses a finite set of categorical variables to express behaviors or events and their varying distributions across time. Sequence analysis has two primary functions: a) to detect sequential patterns of actions over time, and b) to identify the alignment (or similarity) between those patterns. This typically happens through optimal matching techniques, which can be explained by considering the following two structured sequences typical of OSS development:

1: open / comment / merge

2: open / comment / close

In order to measure the similarity of these activity patterns – the extent to which the sequences align – we need to estimate the effort required to transform one of the sequences into the other. In this example, we have to replace the ‘merge’ activity in sequence 1 with a ‘close’ activity to arrive at sequence 2. This simple replacement allows us to quantify the ‘cost’ of alignment, or distance between the sequences, which in this example is 1 (Abbott & Hrycak, 1990). Aligning sequences with more differences results in higher costs and distances: higher costs mean the sequences are more different, while lower costs indicate more similarity. In order to align sequences to each other, an optimal matching algorithm must be applied to sequence data (two sequences minimum) that measures the distances in terms of the number of insertions and deletions (called indels) and substitutions needed to transform one sequence into another. The total distance between two sequences is called an Optimal Matching distance (OM distance) and is a measure of dissimilarity – the degree to which two sequences are not similar to each other. In the context of OSS activity, computing OM distances enables us to describe the extent to which the sampled activity structures are similar or dissimilar. Once sequences are aligned and described, we can estimate the probability distributions of activity transitions. Such transition probabilities are the key to detecting routine structure in the form of ordering and rhythm. Given such a set of transition probabilities, we can then describe the activities, ordering, and structure of a latent routine structure (Pentland, 1995).
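For readers unfamiliar with optimal matching, the following sketch shows the computation in its simplest form. With unit indel and substitution costs, the OM distance reduces to an ordinary edit distance over activity labels; the actual analyses used the TraMineR package (Gabadinho, Ritschard, & Studer, 2011), which supports richer cost structures (e.g. substitution costs derived from transition rates), so this is an illustration rather than the exact procedure.

```python
# A minimal optimal-matching sketch: edit distance over activity labels,
# assuming unit costs for insertions/deletions (indels) and substitutions.
def om_distance(seq_a, seq_b, indel=1.0, sub=1.0):
    """Minimal total cost of indels and substitutions needed to transform
    seq_a into seq_b, computed by dynamic programming."""
    m, n = len(seq_a), len(seq_b)
    # d[i][j] = distance between the first i activities of seq_a
    # and the first j activities of seq_b
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel
    for j in range(1, n + 1):
        d[0][j] = j * indel
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if seq_a[i - 1] == seq_b[j - 1] else sub
            d[i][j] = min(d[i - 1][j] + indel,     # deletion
                          d[i][j - 1] + indel,     # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[m][n]

# The worked example from the text: one substitution, so the distance is 1.
assert om_distance(["open", "comment", "merge"],
                   ["open", "comment", "close"]) == 1.0
```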

Based on sequence analysis, I can next perform cluster analyses on the routine structure data. This form of analysis has previously been used to show how, for example, various implementation processes can be categorized into a taxonomy based on the structural commonalities among procedures (Sabherwal & Robey, 1993). Cluster analysis utilizes the OM distances between sequences to place structurally similar sequences in common clusters, separated from structurally dissimilar sequences.

To conduct the cluster analysis I utilized Ward’s method, an agglomerative hierarchical clustering method that strives to minimize the variance within each cluster. This method does not require a pre-specified number of clusters, but rather produces layers with increasing numbers of clusters until no further clustering is possible. Since the algorithm recommends no specific number of clusters per se, an evaluation of cluster quality must be conducted (Studer, 2013). This was done using two measures, each capturing a specific aspect of cluster quality: Hubert’s Gamma (HG) and Average Silhouette Width (ASW). The former measures the capacity of the clustering to reproduce the original distance matrix as produced by an OM algorithm. The latter measures the “coherence of assignments”, meaning high between-group distances and low within-group distances. HG tends to increase at a diminishing rate as the number of clusters increases, whereas ASW tends to peak at an optimal number of clusters and thereafter decrease. HG can therefore be evaluated in a manner similar to a scree plot in factor analysis – the optimal number of clusters can be found at a distinct bend or “elbow” in the curve. After this point, adding additional clusters will still increase the HG statistic, but only at the cost of rapidly diminishing parsimony.
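The following sketch illustrates this evaluation step, assuming a square matrix `dist` of pairwise OM distances from the previous step. It computes only the ASW criterion (Hubert's Gamma is omitted for brevity), and applying Ward linkage to non-Euclidean OM distances follows common practice in social sequence analysis rather than a formal guarantee; the actual analyses relied on the R toolchain around TraMineR.

```python
# A sketch of cluster-quality evaluation over a precomputed OM distance
# matrix `dist` (square, symmetric, zero diagonal). Ward linkage on OM
# distances mirrors common social-sequence-analysis practice.
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import silhouette_score

def asw_by_k(dist, max_k=8):
    """Average Silhouette Width for each candidate number of clusters."""
    tree = linkage(squareform(dist), method="ward")  # condensed distances
    scores = {}
    for k in range(2, max_k + 1):
        labels = fcluster(tree, t=k, criterion="maxclust")
        scores[k] = silhouette_score(dist, labels, metric="precomputed")
    return scores

# The k at which ASW peaks is retained, analogous to reading the
# 'elbow' of the HG curve described above.
```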

Figure 4. Graphical representation of clustering


This yields clusters of sequences that have similar structural properties: patterns of task types and their ordering. The output of a cluster solution is visualized in Figure 4. Here, the clusters are projected onto a two-dimensional Euclidean space, where the distances between dots represent OM distances between sequences. It is important to note that, because it is a Euclidean projection, the X- and Y-axes have no direct interpretations. Rather, the graph should be seen as a type of coordinate system that allows the reader to visually estimate distances between observations (i.e. distinct routine enactments). In this particular example, two clusters are elicited. We can use the distances between dots within a cluster to visually gauge the internal heterogeneity of that cluster, and the distances between dots from different clusters to estimate the heterogeneity across clusters.
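A projection of this kind can be produced with multidimensional scaling, which embeds the OM distance matrix in two dimensions so that inter-point distances approximate inter-sequence distances. The sketch below is illustrative, reusing the hypothetical `dist` and `labels` objects from the previous sketch rather than the dissertation's actual plotting code.

```python
# A sketch of a Figure-4-style plot: metric MDS embeds the precomputed OM
# distance matrix in two dimensions for visual inspection of the clusters.
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

def plot_clusters(dist, labels):
    xy = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dist)
    # The axes carry no direct interpretation; only relative distances matter.
    plt.scatter(xy[:, 0], xy[:, 1], c=labels, cmap="tab10", s=12)
    plt.xticks([]); plt.yticks([])
    plt.title("Routine performances projected by OM distance")
    plt.show()
```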

When such clusters have been identified, I can randomly sample sequences from each cluster (random sampling is necessary to reduce the size of the dataset so that it can be hand-coded using qualitative methods) and proceed to code them qualitatively, so as to understand not only the ‘thin’ quantitative properties of information processing routines, but also the ‘thick’ context and meaning within which particular types of routines are embedded.

Qualitative Inquiry

When the structures of routines have been identified using computational methods, the goal of the inquiry process is markedly clarified: how can I explain the particular patterns that have been identified? Using the overall theoretical framework of the IPV to sensitize myself to the ways in which information was processed across routines, I can proceed to code interviews and archival data for related themes. The process is iterative, and relies on using the digital traces of routine performances stored as text on the Github platform to reach a detailed understanding of local activities. Simultaneously, I use interview data and other publicly available archival data (conference presentations, blog posts, etc.) to understand the context of local activities. By zooming in and out across quantitative patterns, context, and text, a strong theoretical account of the processes that generate the observed quantitative patterns can be elicited. Below I will describe each of these aspects.

Beginning with archival data and interviews, I coded for emergent themes. This helped me to arrive at an overall understanding of a) the OSS context, b) practices common to all (or most) projects hosted on Github, and c) the idiosyncratic characteristics and practices that differentiate the four projects at the center of this dissertation from each other. Such contextual understanding is essential for interpreting the micro-level activities contained within distinct routine performances. Hence, an overall coding scheme was established to identify the information processing activities and functions performed within various routines (see Table 11). This coding scheme could then be used to code a partial, random sample of the total number of sequences captured in my archival data, to establish their forms of information processing. This enables me to characterize what latent routine structure a specific cluster of routine performances actually represents. As the activity sequences in each cluster have already been determined to be structurally similar to each other by the clustering algorithm, it is fairly straightforward to identify the commonalities in information processing across analyzed activity sequences. I also identified repertoires of activities within each cluster, which collectively performed the information processing function that each routine cluster serves. The identification of different repertoires of activities expanded the granularity of the coding scheme, as more detailed types of activities within each cluster could be identified. For example, a category like “rewriting code” was expanded into “adding features”, “increasing code clarity”, and “increasing performance”.

Theoretical accounts of the generative mechanisms behind observed quantitative patterns emerged inductively, through a hermeneutic process (Boland, Newman, & Pentland, 2010) of iterative back-and-forth analysis of the data on routines (as text) and the interview and archival data (as context). As such, in the analysis I can “zoom in and out” (Gaskin et al., 2014) across low-level routine data and an interpretive understanding of its social structuring to construct emic descriptions. Next, I will explain how the validity of the inferences that were made was established.

Assessing Validity

The ways in which digital traces are generated carry certain epistemological implications. While these implications may be construed differently based on a researcher’s ontological and epistemological outlook, I have chosen to go down a pragmatist path that puts interaction at the center of social reality. Further, the strictly behavioral data that I have access to implies a rather positivist outlook on human behavior – we examine what we can observe. This does not mean that I am skeptical of interpretative studies or of qualitative inquiry into the dynamics exhibited by various OSS projects.

On the contrary, my theoretical intuition tells me that we can expect substantial relationships between agents’ exhibited behavior and the experience of the agents executing said behavior. Such a relationship is not necessarily one of correspondence, but neither is it expected to be random or disconnected. Rather, I expect there to be complex bidirectional relationships between the behavior exhibited by actors and their subjective experience of those behaviors. By using qualitative inquiry and integrating its findings with the quantitative findings, I seek to ensure that the inferences I make based on behavioral trace data are also consistent with the emic accounts of the agents generating that trace data.

Specific validation frameworks for mixed methods studies have now grown quite sophisticated (Venkatesh et al., 2013) and provide specific guidelines for conducting such research. Using such a framework, I evaluate the validity of my inferences across two main aspects: design quality and explanation quality. The former relates to the appropriateness of the procedures used to answer my specific questions, and the latter to the degree to which credible interpretations have been made based on the findings. Each of these has multiple quality criteria. I outline these criteria, their related concerns, and the ways in which I addressed those concerns in Table 6 below.


Table 6. Mixed Methods Validation Framework

Design Quality

Design suitability
Concern: Are the selected methods suitable to answer the posed research questions?
Response: Sequence analysis and qualitative inquiry provide insight into both the global patterning and the local meaning of routine structures.

Design adequacy
Concern (quantitative): Is sampling, collection, and measurement done in a reliable and rigorous way?
Response: Data used for sequence analysis are based on the totality of data available for the sampled projects, and measurement is done in a computational, as opposed to scale-based, manner.
Concern (qualitative): Is data collected and analyzed in a way that ensures credibility and dependability?
Response: Qualitative data collection uses a triangulation logic across digital-traces-as-text, interviews, and publicly available narrative documents.

Analytic adequacy
Concern (quantitative): Are analysis strategies appropriate to answer the research questions and thus indicate statistical conclusion validity?
Response: Sequence analysis explicitly focuses on process (emergence and evolution) and variation.
Concern (qualitative): Are analysis strategies appropriate to answer the research questions and therefore likely to generate theoretical validity and plausibility?
Response: Grounded theory across both context and text ensures validity and plausibility.

Explanation Quality

Quantitative inferences
Concern: Do inferences follow closely from the data and thus indicate internal/external validity and statistical conclusion validity?
Response: Results are statistically reliable due to large sample sizes and short conceptual distances between concepts and measurement.

Qualitative inferences
Concern: Do interpretations follow closely from the data and thus indicate credibility, confirmability, and transferability?
Response: Qualitative inferences can be triangulated across text and context, as well as confirmed by quantitative patterns.

Integrative inferences
Concern (integrative efficacy): Are qualitative inferences and quantitative inferences integrated into theoretically consistent meta-inferences?
Response: Due to the dual nature of digital trace data, meta-inferences are based on alternative perspectives on the same data – thus ensuring consistency.
Concern (inference transferability): Can inferences be transferred to other contexts or settings?
Response: Due to generic affordances provided by a) the common technical platform and b) common cultural OSS mores, transferability across other ‘open’ contexts is expected.
Concern (integrative correspondence): Do the meta-inferences satisfy the initial purpose of the study?
Response: Meta-inferences provide rich accounts of the origin, evolution, and variation of practices, both as global patterns and local action.


Given the opportunities for tight integration that the dual nature of digital trace data affords, the validity of the meta-inferences (inferences integrated across both quantitative and qualitative evidence) is expected to be high. The research design is therefore expected to be sufficient for arriving at high-quality answers to the research questions. Next, I will provide a brief overview of how the research design was translated into three actual studies, after which the body of this dissertation – the three studies themselves – follows.

Overview of Three Complementary Studies

To address the research questions stated above, utilizing the research design, data, and analysis methods I have described, I conducted three studies (as summarized in Table 7 below). Each of these studies grapples with one of the three major questions posed. The first study explores the origin and emergence of routine structures and the information processing functions that they perform, the second study examines the evolutionary trajectories of such routine structures across a proposed “OSS lifecycle”, and the third study inquires into the variation of routine structures across multiple OSS projects.

Together, these three studies break new ground in two major ways. First, they utilize conceptual and methodological tools that attend to structural aspects of OSS organizing which are not relational (such as network structure), but rather focus on routinized, enduring activity patterns. Through rigorous conceptualizations, as well as quantitative tools that capture the structural variation in routinized structures, new vistas for inquiring into the temporally structured nature of organizing are opened.


Table 7. Overview of the three studies in this dissertation

Study #1: The Origin of Routines as Problem Solving Mechanisms
Research Question: What is the origin of routine structures and what functions do they serve in OSS development?
Theoretical Foundation: Information Processing View (Galbraith, 1973, 1974)
Research Design: Cross-sectional
Data: 1 case: Rubinius
Analyses: Clustering routine structures & coding routine performances as text
Findings: Four distinct routine clusters form a system which addresses both simple and complex problems

Study #2: The Evolution of Routines as Responses to Environmental Shifts
Research Question: How do routine structures change and evolve over time in OSS development?
Theoretical Foundation: Software lifecycle models (Capiluppi, González-Barahona, Herraiz, & Robles, 2007; Lehman, 1978; Rajlich & Bennett, 2000)
Research Design: Longitudinal
Data: 1 case: Rails
Analyses: Tracing distributions of clustered routine performances over time
Findings: Routine variety scales with the controversy of features being developed across the release cycle

Study #3: The Variation of Routines as Responses to Organizational and Technical Conditions
Research Question: How and why do routine structures vary across OSS projects?
Theoretical Foundation: Rationalities and attendant cognitive heuristics (Becker, 2005; Hutchins, 1995; Simon, 1980)
Research Design: Comparative
Data: 4 cases: Rubinius, Rails, Bootstrap, & Django
Analyses: Comparing routine structures and their variation to contextual patterns
Findings: Routine variety is an adaptive response to varying levels of technical complexity, driven by various rationalities

Second, these studies provide a multi-faceted framework through which we can understand the dynamics of organizing in temporally and geographically distributed contexts mediated by digital platforms – as exemplified by the OSS context.


The framework and findings articulated across these three studies allow us to discern how low-level action sequences become bundled into routines, which in turn adapt to environmental variation (complexity). Such a comprehensive framework lays the groundwork for understanding how digital design work can be coordinated across distributed communities, as well as for explaining potential variance in how such communities approach such endeavors.

Origin: A Cross-sectional Study

In the first study I explore how coordination in OSS is possible and how we can understand OSS coordination based on routines. I adopt an information processing view (IPV) whereby OSS development tasks present varying degrees of task complexity and need to be addressed by distinct and varying development routines. To understand the variance and composition of such distinct routines, I conduct an exploratory case study utilizing mixed methods, focused on the variation in routines in a typical, mid-sized OSS project, and theorize why such different routine classes emerge. I use sequence analysis to identify four types of routines: triaging tasks, transferring information, discourse-driven problem solving, and direct problem solving. Next, I explore these four types of routines through qualitative analysis to show what types of functions such routines serve. I find that two alternative dimensions of routine variety – entropy (i.e. diversity of activity types) and heterogeneity (i.e. diversity of activity ordering) – impact differential development outcomes, such as the efficiency or the likelihood of successful problem solving as problems grow in complexity. I conclude by proposing a new theory of the ecology of OSS routines that explains how OSS projects are likely to generate requisite variety in their responses, through routine discovery, so as to tackle development tasks of varying complexity.
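To make the entropy dimension of routine variety concrete, the sketch below computes Shannon entropy over the distribution of activity types in a set of sequences. It is an illustrative formulation under the definition given above (diversity of activity types), not necessarily the exact operationalization used in the study.

```python
# A minimal sketch of the 'entropy' dimension of routine variety: Shannon
# entropy over the distribution of activity types across a set of sequences.
# (The 'heterogeneity' dimension, by contrast, concerns the diversity of
# activity orderings, captured via OM distances.)
from collections import Counter
from math import log2

def activity_entropy(sequences):
    """Shannon entropy (in bits) of the activity-type distribution."""
    counts = Counter(a for seq in sequences for a in seq)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# A project drawing evenly on many activity types scores high; one whose
# routines consist almost solely of 'opened'/'closed' events scores low.
print(activity_entropy([["opened", "discussed", "merged", "closed"],
                        ["opened", "closed"]]))  # ~1.92 bits
```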

Evolution: A Longitudinal Study

The second study focuses on the change of routine structures over time and develops an OSS development lifecycle model. To this end I conduct a longitudinal analysis, integrating multiple methods (qualitative and computational), focused on a medium-scale OSS project – Ruby on Rails. Again, I use sequence analysis to detect how the heterogeneity of routines and the structure of discourse in the project ebb and flow over time. I identify two forms of routines – discourse-driven and direct problem solving routines – and trace their proportional distributions over time. I also identify three stages of OSS development during a release cycle: a) cleanup, exhibiting a balance between discourse-driven and direct problem solving; b) sedimentation, dominated by direct problem solving; and c) negotiation, dominated by discourse-driven problem solving. These stages manifest different intensities of discursive shaping of software features across the community. I qualitatively explore this discourse and note that discourse-driven problem solving is typically associated with some form of controversy, whereas direct problem solving is not. Accordingly, I propose a process model distinguishing between discourse-driven and direct problem solving and their different roles throughout the lifecycle.

Variation: A Comparative Study

The third and last study examines variance in development patterns across four projects. These patterns are project-wide routines, which form emergent methodologies that developers use in order to respond to information processing needs emerging from the environment. I advance an analysis and explanation of variation – or heterogeneity –


in OSS development routines. Variation in routine heterogeneity reflects variation in a project's information processing capacity, and such variation is crucial for satisfactory performance, since circumstances vary between different development projects. Again, through an exploratory sequence analysis of digital trace data I uncover substantial variations in routine heterogeneity across four OSS projects: Rails, Django,

Rubinius, and Bootstrap. To explain such variation, I conduct a qualitative inquiry to uncover the generative mechanisms behind the observed variations in routine heterogeneity. I find that routine heterogeneity is influenced by “information processing needs” that emerge from problems associated with implementing code in each project. To address such needs, projects deploy alternative strategies characterized by their different forms of rationality – principled or discursive. These strategies align routine heterogeneity with varying information processing needs faced by different projects. The study concludes by outlining a process model of how development routines vary, thereby articulating how distinct guiding rationalities influence the degree of routine heterogeneity during OSS development.

The Structure of a Three-Paper Dissertation

In line with contemporary publishing-centric trends, this dissertation is organized around three papers. Each paper is presented in its entirety below. While each has distinct foci, as outlined above, they share similar problem statements, theoretical frameworks, and methods. Therefore the reader may experience some overlap in content. After the three papers I summarize the main theoretical, methodological, and practical contributions of the dissertation. I finish by outlining the limitations and the future research stream that this dissertation is intended to launch.


Study #1: The Origin of Routines as Problem Solving Mechanisms

Open Source Software (OSS) projects present us with an apparent conundrum: these projects involve independent, high-turnover, geographically distributed, volunteer developers who typically do not know each other. Nevertheless, OSS developers somehow succeed in coordinating the complex task interdependencies necessary to construct sophisticated software. Explanations of this conundrum highlight two related ideas: modularization (Kogut & Metiu, 2001) and layering, or ‘superpositioning’, of code

(Howison & Crowston, 2014; Yoo, Henfridsson, & Lyytinen, 2010). Both explanations rest on how OSS developers reduce complexity to enable simultaneous work by independent developers. Modularization involves the separation of code into distinct, independent units, and superpositioning describes how independent developers can then address simple, discrete tasks by layering code on top of existing modular elements.

Modularization is an attribute of the code itself, whereas superpositioning describes the development process that ensues given the modular structure.

However, code does not modularize itself, and development tasks that at first glance appear simple may actually be quite complex. In these situations OSS projects appear to offer limited options for the a priori coordination necessary to handle such complexity.

Research into traditional forms of software development suggests that the successful development of complex software depends on the presence of highly structured and adequately diverse organizational processes, or ‘routines’, which are necessary to


navigate the uncertainty and ambiguity associated with managing interactions within and between code elements (Kraut & Streeter, 1995; Zmud, 1980). This literature assumes, however (due to the presence of hierarchical control), that the means available to the development team for processing necessary information can be formally specified beforehand (Sabherwal, 2003) by establishing strict methodologies (see e.g. Nidumolu

1995), or by using formal coordination mechanisms such as project plans and explicit requirement documents (Andres & Zmud, 2002), as well as by relying on authority-based governance (Kirsch, 1996). However, in the case of OSS, where the presence of these elements is weak or nonexistent, the question emerges: how does the necessary coordination emerge to tackle complex development tasks given the decentralized and ad hoc interactions between developers (Benbya & McKelvey, 2006)?

In addressing this question I draw upon Ashby’s (1956) Law of Requisite Variety3 and the Information Processing View (IPV) of organizational coordination (Galbraith, 1974).

IPV is rooted in Ashby’s Law, and has been widely used in theorizing about organizational coordination and related structuring challenges (Crowston, 1997). I chose this view because it provides us with a relatively simple language to discuss task complexity in terms of information processing needs associated with OSS development, and a way to analyze related coordination challenges. Developers somehow coordinate with each other to address development tasks of varying complexity, and repeated coordination leads to the emergence of routines. Development routines can be defined as

3Ashby’s Law of Requisite Variety states that it takes variety to destroy variety. This is the fundamental theoretical insight for understanding how to manage and navigate complexity in a variety of disciplines, and is therefore an appropriate foundation for understanding how OSS communities handle complex software tasks. Galbraith drew upon Ashby’s ideas to characterize how organizations deal with complexity.


typified activity sequences for handling the information processing needs associated with solving development problems of varying complexity. Routines persist over time and are both results of and conditions for continuing coordination (Feldman & Pentland, 2003).

To understand the role of routines in coordinating OSS activities, I ask:

a) What distinct types of routines are present within OSS projects?

b) What information processing functions does each of these routines provide for dealing with development tasks of varying complexity?

c) How do routines, when carried out as a joint aggregate, enable OSS developers to tackle more complex tasks?

To answer these questions I carry out an exploratory, mixed-methods (qualitative and computational) case study of a mid-sized OSS project of significant complexity named ‘Rubinius’. Rubinius is a virtual machine and interpreter for the Ruby programming language, which is used widely in the development of web applications (e.g. Ruby on Rails). I analyze the complexity and variety of development routines using sequence analysis and statistical clustering techniques (Gaskin, Berente, Lyytinen, & Yoo, 2014) and detect four routine clusters. Further, I characterize the information processing functions of each routine cluster in terms of its routine variety: variation in unique activity types involved

(entropy) and their sequencing (heterogeneity) within each routine. I then specifically address routines that deliver functional code and dig more deeply into how variety in these routines impacts outcomes such as their problem solving success and efficiency. I conclude by proposing a process model for OSS development in terms of how the four types of routines jointly constitute a coordination system, which generates the requisite


variety necessary to address incoming streams of development tasks of varying complexity.

Coordination and Variation in OSS Development

OSS projects often create sophisticated software despite the presence of several adversarial conditions which are antithetical to successful coordination, including physical and temporal distribution (Herbsleb & Moitra, 2001). Contemporary explanations of effective coordination in OSS rest on the modularity thesis (Baldwin &

Clark, 2000; Parnas, 1972; Simon, 1962). Using modularization techniques, development tasks can be decomposed into manageable, independent pieces that can then be tackled by individual, solitary developers (Ovaska, Rossi, & Marttiin, 2003). Building on this idea,

Howison and Crowston (2014) recently argued that OSS developers build layers of code over time (i.e., ‘superposition’) through the performance of incremental tasks. These incremental tasks focus on relatively simple development problems, which can be attacked by solitary programmers. Conceivably, OSS communities are not well-equipped to handle more complex tasks head-on, because of the rather ‘thin’ knowledge conduits afforded by the virtually mediated relationships within OSS communities. Rather, tasks need to be queued until layers of incremental code contributions somehow render the tasks less complex. Excessively complex tasks are abandoned while simple tasks are retained for further processing.

Through the process of ‘superpositioning’, OSS development tasks remain simple and can be primarily tackled by individual developers. At the same time, coordination costs are minimized. Here, initial modularization is a necessary pre-condition for successful superpositioning, while the incremental layering of code that ensues is a characteristic of


the development process. Hence, both modularization and incremental layering of code follow the same principle: through tackling only bits and pieces of tasks at a time, larger tasks become divided into smaller, relatively independent tasks, which are easier to solve.

For these strategies to be successful, however, one of two assumptions must hold: either all OSS development tasks are inherently simple, or development processes have the capacity to turn complex tasks into simple tasks through simple activities. The first assumption is naïve and contrary to evidence from the extant literature on OSS, and the second begs a fundamental question – what sort of processes transform complex development tasks into ones that can be solved by lone developers performing incremental layering of modularized code?

How OSS development tasks and processes relate to one another can be conceived in light of Ashby’s Law of Requisite Variety (Ashby 1956). The law suggests that to adequately control a system, the number of states available to the control mechanism (i.e.

‘variety’ or ‘complexity’ of the controlling system) must be equal to or greater than the number of potential states of the system (i.e. ‘variety’ or ‘complexity’ of the system under control). Stafford Beer (1984) applied the law of requisite variety to problem solving tasks in organizations, and identified two tactics for handling complex problems.

The first tactic is to reduce the complexity of the problem to a level that the problem solving system can handle (‘attenuate’ the complexity of the problem). The second tactic is to increase the capacity of the problem solving system to handle the problem (‘amplify’ the amount of complexity that the problem solver can absorb). In the context of an OSS development problem, the level of complexity made available by the structures of an OSS project (i.e., its community and development processes), which generate the solution,


must (at least) match the complexity of the task itself. Accordingly, the complexity of tasks dealt with will be mirrored in the complexity that associated organizational structures such as routines, relationships, or other coordination mechanisms exhibit

(Pentland, 2003; Baldwin & Clark, 2000). However, modularization and superpositioning in OSS only attend to the complexity ‘attenuation’ tactic.

The combination of modularity and superpositioning attenuates complexity and decreases the need for developers to actively coordinate their task interdependencies (Howison &

Crowston, 2014). However, the key assumptions of this approach – initially modularized code and incremental layering of additional code – do not account for how complex tasks become simplified in the first place. Therefore, such accounts imply that OSS development communities do not possess requisite variety to handle more complex tasks until the complexity of tasks has been reduced. How this reduction occurs, however, is not addressed in the current theory. Can individuals who only deal with simple tasks adequately decompose complex problems? This claim contradicts the fundamental idea of Ashby’s law: low variety never reduces high degrees of variety. Rather, the system leveraged to solve tasks needs to possess the requisite variety of states necessary to match the complexity of tasks through amplification. In this study I address this issue of how amplification happens in OSS. To do so, I draw upon the information processing view of organizations.

An Information Processing View of OSS Development

In his seminal work, Galbraith (1974) drew upon Ashby’s Law of Requisite Variety to explain how organizations structure themselves to handle the complexity of their environment. Accordingly, for decades IPV has been a foundational lens for exploring


the emergence of varying organizational structures (see e.g. Eisenhardt et al., 2010).

Galbraith observed that in order to attend to complex situations, organizations must be able to process the information associated with those situations. Highly complex situations require an organization to process more information than simple situations; i.e. information processing requirements must always be matched by the variety provided by the organizational structures selected for processing information, which are thereby ‘fit’ or ‘aligned’ to the variety of the information processing requirements. If this does not happen, performance deteriorates. Galbraith proposed a number of coordination structures, among others: organizational working arrangements (i.e. hierarchy, teams, etc.), organizational routines, and computer-based information systems.

Traditionally, (non-OSS) software development has been organized through formal organizational structures such as organizational units, project teams and steering committees, as well as through development routines rooted in particular development methodologies (Fichman & Kemerer, 1992) – such as agile and waterfall approaches (e.g.

Vidgen & Wang, 2009). OSS projects, on the other hand, generally involve weak authoritative and hierarchical relationships, and methodological structures for coordination are typically not enforceable. Instead, OSS projects rely upon technical platforms (e.g., Sourceforge, Github etc.) to coordinate activities. These platforms cannot in themselves process task information, though they can store and trace such information.

By doing so they coordinate the activities of the developers who process the actual information. However, when a task is defined it is not necessarily clear up front how much information processing the task will require or how to carry it out. In the case of

OSS, information must be ‘scouted’ and ‘processed’ in order to make sense of incoming


feature requests and bug reports, to inquire into code interdependencies, and to write and test the code. These activities are performed through ‘routines’ that contingently emerge in OSS communities (von Krogh, 2005); they are patterned sequences of activities which are continually reenacted and enable diverse actors to coordinate task interdependencies over time (Nelson & Winter, 1982). Routines store organizational knowledge and heuristics that help developers solve problems. Through drawing upon such resources,

OSS developers can coordinate interdependencies across tasks so as to address the development complexity they face. Routines in OSS typically emerge over time – a management hierarchy does not specify them a priori. Thus, such routines are often configured ‘on the fly’ to generate the requisite variety necessitated by the tasks at hand.

I posit that task complexity has two important aspects, which are central to the conduct of software development tasks. Wood (1986) refers to the first as ‘component complexity’ – the number of activity types that must be executed to complete a task, and the second as

‘coordinative complexity’ – the degree to which distinct activities of a task are interconnected. Hence, task complexity increases when the number of activity types and their interrelationships increase. As the number of required activity types increases, the overall uncertainty associated with information processing increases, because each additional activity type signals the need to process additional information (Galbraith,

1974). As the interrelationships between activities increase, the ambiguity of information increases, because it signals that bits of information may be conflicting, or differently interpreted in the light of other bits of information at hand (Daft & Lengel, 1986; Daft &

Weick, 1984; Weick, 1979). Hence, task complexity overall is composed of uncertainty and ambiguity, each of which carries different information processing requirements; consequently,


different routine structures are necessary to deal with these requirements. The focus on

‘fit’ or ‘alignment’ within IPV implies that an OSS project can only successfully solve those tasks for which it can generate requisite degrees of variety. If this is not the case, the development task can only be carried out by aligning and modifying OSS routines to generate requisite variety (Venkatraman & Camillus, 1984; Venkatraman, 1989). That is, task complexity must always be matched by routine variety – by inducing new variation in performed activity types and their ordering.

To clarify what routine variety means in terms of the structuring of the routines, I next refine the idea through drawing on the complexity literature (Prokopenko, Boschetti, &

Ryan, 2009). While overall definitions of complexity are contested, they tend to converge on a number of features such as variety and volume of components, a multitude of interactions between components, and self-organizing or emergent global patterns arising from interactions. These features give rise to uncertainty and ambiguity, i.e. predicting any individual outcome of a problem-solving process is difficult, and the overall range of possible outcomes is poorly defined. The former increases as the number of components increases, and the latter increases when the number of interactions between components increases. Together they constitute two essential aspects of variety. Hence, routines need to provide information processing capacities which can deal with both uncertain and ambiguous problems. Here, I introduce two aspects of routine variety, which show how this is accomplished: entropy and heterogeneity.

Entropy captures the capacity of routines to process the uncertainty of information

(Shannon, 1948) – i.e. the amount of information with regards to a task that needs to be extracted and processed (information about a problem is changed into information about


a solution). For example, uncertainty may relate to the specific location of the problem in the codebase. Accordingly, highly entropic routines contain multiple types of activities, which therefore can increase the capacity to handle the uncertainty inherent in a particular development task so that a solution can be obtained. While such routines may entail the processing of large amounts of information, they do not deal with ambiguous information.

Heterogeneity (Salvato, 2009; Studer, Ritschard, Gabadinho, & Müller, 2011) refers to the capacity of routines to process the ambiguity of information. For example, a technical problem may not have an unequivocal solution, but rather several possible solutions, of which the preferred alternative must be established through social negotiation. This necessitates structuring routines so that they can handle multiple interpretations and compare multiple design implications (possible solutions) emerging from the development task. Such routines conduct multiple inquiries through iterating a diverse set of activities across several information processing cycles. Essentially this entails a many-to-one mapping between multiple interpretations of a design problem and its eventual solution. Highly heterogeneous routines thus consist of a repertoire of activities ordered in varied ways across enactments. Heterogeneous routines enable developers to tackle not only uncertain, but also ambiguous, and therefore more complex problems that require non-linear solution processes – those that require a great deal of interpretation and sensemaking in both problem formulation and problem solving (Daft & Weick, 1984).

The key terms associated with this information processing view of OSS development are summarized in Table 8.


Table 8. Definitions for IPV Concepts in OSS

Task Complexity – The overall uncertainty and ambiguity of a task, captured by the amount of activity types and their degree of interrelationships within a development task. Higher complexity implies more information processing needs (e.g. debugging a chain of interdependent functions is more difficult than debugging a single, isolated function).

Routine Variety – The overall variation of activity types and their ordering within a routine (e.g. a routine with many activity types ordered in heterogeneous ways has more variety than a routine with few activity types ordered strictly sequentially without repetition).

Entropy – The capacity of routines to process the uncertainty of information, captured by the variation of activity types. More entropic routines indicate that a larger amount of uncertainty is being processed, without necessarily representing complex relationships between multiple bits of information (e.g. solving a problem that involves rewriting a large amount of independent bits of code, such as several tests).

Heterogeneity – The capacity of routines to process the ambiguity of information, captured by variation in the sequential order of activities. More heterogeneous routines inquire into multiple interpretations of information and multiple possible solutions (e.g. a long debugging process which involves multiple tests, probes, and experiments is more varied and generates more novel solutions than a short sequence of similar activities which fixes a discrete problem).

Activity sequence – A specific performance of a routine involving specific time-stamped activities connected to a single pull request identification number (e.g. a sequence of the following activities: pull request-comment-merge-close).

Routine – A cluster of activity sequences with similar structural characteristics, which emerges from the enactments or performances of developers, geared towards processing information (e.g. a problem solving routine which consists of long, heterogeneous sequences geared towards matching task complexity).

I posit that when developers can leverage routines with the requisite degrees of entropy and heterogeneity vis-à-vis tasks of varying complexity, it is reasonable to assume that they are able to complete the full range of such development tasks. Single developers are able to grasp the entire problem space associated with simple tasks, which reduces the need for task coordination. We can expect that such tasks will be executed through highly entropic routines, which can translate discrete bits of information about a problem efficiently into a discrete solution. The problem space associated with complex tasks, however, is ambiguous and conflicted. We can expect the execution of such tasks to involve highly heterogeneous routines consisting of multiple cycles of iterated diverse


and complex sets of activities. To coordinate such complex routine patterns, developers need ways to a) identify the degree of task complexity that a certain task entails, and b) find ways of generating requisite amounts of entropy and heterogeneity so as to match the complexity of tasks. Guided by my three research questions, I next conduct an exploratory study of an OSS project in an effort to explore how routines emerge to handle complexity.

A Cross-Sectional Study of OSS Routines: Task Complexity and Routine Variety

My study focuses on a medium-sized, successful OSS project – Rubinius

(http://rubini.us/) – that develops a virtual machine (VM) implementation of MRI (Matz’s

Ruby Interpreter, the original Ruby interpreter) written in the Ruby programming language. Rubinius is regarded as a flagship project for the Ruby community – it is essentially an opportunity to showcase the power of the programming language. At the time of the study, a major feature was being implemented in Rubinius: adapting to computers with multiple CPU cores, which allows for concurrent multi-threaded processing. Due to the complexity related to the parallelism of new CPU technology, the project offered an opportunity to examine both simple and complex tasks associated with the new implementation.

To address my research questions I conducted a mixed methods case study (Gaskin et al.,

2014) of Rubinius routines. This involved a multi-step process that utilized multiple forms of data and analysis. An overview of the two-pronged study process is depicted in

Figure 5. After the case had been selected, both computational (upper half) and qualitative (lower half) work ensued. The computational work consisted of extracting digital traces, using cluster analysis to elicit routines from the digital traces, extracting


relevant descriptive statistics for each of the clusters, and conducting regressions. The qualitative work consisted of conducting interviews and collecting archival data (i.e., public audio, video, and text), coding the interviews and archival data to establish a coding scheme, which was then used to code a sample of the activity sequences within each routine. To ensure the validity of inferences made and to construct a theoretical framework through interconnecting extracted categories, I compared (‘zoomed in/out’) across qualitative coding and computational analyses.

Figure 5. Research Process

By conducting both qualitative and quantitative analyses, I am able to establish consistency across three forms of evidence: 1) contextually-derived theoretical categories

(elicited from interviews and archival data), 2) qualitative coding of micro-level routines, and 3) descriptive and inferential statistics emerging from computational work.

Following principles of interpretative field research (Klein & Myers, 1999), I iterate back-and-forth between analysis of data on routines (as text and numbers) and the interview and archival data (as context). As such, my analysis “zooms in and out”

(Gaskin et al., 2014) across micro-level routine data and an interpretive understanding of its contextual significance. This allows us to draw inferences with regards to how the information processing functions of each routine relate to each other so as to form a system that jointly addresses tasks of varying complexity.


Case selection

I selected Rubinius as a case study for three reasons: 1) it is a typical example

(Eisenhardt, 1989) of a relatively successful OSS project; 2) it has had multiple official releases and has gained traction within the larger web development community that sprang up around the Ruby programming language in general, and the Ruby on Rails web development framework in particular; 3) the project is medium-sized, but technically very challenging, and its implementation involves a high level of technical uncertainty. I therefore argue that understanding the dynamics of this case helps us gain further insights

(Yin, 2008) into the emergent coordination of medium-sized, complex OSS projects, which often lack the elaborate and explicit governance structures shared by the largest projects of the OSS world (e.g. Linux, Apache, Mozilla).

Data collection

The main portion of the data was available in the natural medium of OSS development – i.e., online communication. This enabled us to conduct a study over the course of 12 months of development activities (January 6th 2012 to January 6th 2013) and to triangulate the data across multiple data sources associated with working on the 2nd main release of the software – a critical period in the project. The data available covers digital traces of activities recorded by the version control system Github (https://github.com/), along with the public archival data on Rubinius development (public interviews, blog posts, conference talks etc.). We also conducted interviews with founders, corporate sponsors, core developers, as well as peripheral developers. This afforded us rich access to the detailed, inner workings of the project. The vast majority of work took place on the


Github version control system. My approach treats all this data as inputs to the process of formulating theory.

Extracting digital traces

Routines are rendered measurable through Github, which records the time-stamped sequencing of all activities, associated actors, activity connections, and activity content, all organized by pull requests (suggested code changes). I operationalize a pull request as a sequence of activities connected by a common pull request identification number (PID).

Pull requests are used by developers who do not have access rights to edit the repository directly, or when developers want feedback on their suggested code changes. A pull request can contain multiple commits (i.e., detailed changes to specific pieces of code).

This is code that a developer has merged into his/her own local copy of the code, and now wants the project owner to ‘pull’ so that it can be merged into the common baseline copy of the code. The request is then discussed and renegotiated before a decision is made whether to merge the code into the code baseline, to reject the request, or to postpone work on the request. The types of activities that are logged are specified by the designers of the Github platform itself and are detailed in Table 9.

In order to extract the digital traces of activities (Anjewierden & Efimova, 2006), I developed several scripts using the data mining toolkit by Gousios & Spinellis (2012), each of which is captured in Appendices A-D. These scripts capture every activity related to each pull request during the given time period. I extracted 686 distinct PIDs comprising 3,704 activities. All of these activity sequences can be analyzed computationally, as well as qualitatively as texts of bug reports, discussions around how


to fix bugs, and how the eventual code fixes were done.
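To make the extraction step concrete, the sketch below shows how time-stamped Github events might be grouped into per-PID activity sequences. This is a minimal Python illustration, not the author’s original scripts (which appear in Appendices A-D); the input file and its column names ("pid", "timestamp", "activity") are hypothetical.

```python
# A minimal sketch (not the original extraction scripts) of grouping
# time-stamped Github events into per-pull-request activity sequences.
# The input file and columns ("pid", "timestamp", "activity") are hypothetical;
# timestamps are assumed to sort correctly as strings (e.g. ISO-8601).
import csv
from collections import defaultdict

def load_sequences(path):
    """Return {pid: [activity, ...]} with activities ordered chronologically."""
    events = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            events[row["pid"]].append((row["timestamp"], row["activity"]))
    return {pid: [act for _, act in sorted(evs)] for pid, evs in events.items()}

# Example: sequences["1024"] might yield ["opened", "commented", "merged", "closed"]
sequences = load_sequences("rubinius_pull_request_events.csv")
```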

Table 9. Activity Frequencies per Cluster

(Each row reports Freq. and % for, in order: Transferring Information | Discourse-driven problem solving | Direct problem solving | Triaging Problems.)

assigned: 0 (0.0) | 2 (0.1) | 0 (0.0) | 1 (0.5)
closed: 46 (17.8) | 410 (13.1) | 300 (27.0) | 101 (47.0)
commented: 46 (17.8) | 831 (58.7) | 157 (14.1) | 100 (46.5)
mentioned: 14 (5.4) | 394 (12.6) | 30 (2.7) | 2 (0.9)
merged: 1 (0.4) | 106 (3.4) | 261 (23.5) | 1 (0.5)
opened: 1 (0.4) | 113 (3.6) | 149 (13.4) | 5 (2.3)
referenced: 150 (57.9) | 177 (5.7) | 140 (12.6) | 3 (1.4)
reopened: 1 (0.4) | 13 (0.4) | 1 (0.1) | 2 (0.9)
reviewed: 0 (0.0) | 73 (2.3) | 73 (6.6) | 0 (0.0)
Total: 259 (100.0) | 2119 (100.0) | 1111 (100.0) | 215 (100.0)

Interviewing and collecting archival data

For interviews I sampled 17 informants spanning each key stakeholder group of Rubinius, including founders, corporate sponsors, core team members, regular contributors, and peripheral contributors. Interviews were dynamically structured to tap into each interviewee’s actual experience of developing code for Rubinius. The duration of each interview ranged from 45 minutes to 3 hours, and common questions were of the type:

“Please describe a recent contribution you made. What did you do? What were you thinking?” The dataset also includes public data sources such as audio (podcasts and interviews, ranging in duration from 1 to 2 hours each), video (conference keynotes, panels, talks, and interviews, ranging in duration from 30 minutes to 3 hours each), and text (blog posts, interviews, and documentation). I also collected all emails from the

Rubinius mailing lists and chat conversations from IRC archives. Interviews were


transcribed; public audio had already been transcribed by external parties; and specific sections of public video were transcribed based on their relevance to my research questions. See

Table 10 for a summary.

Table 10. Data Collection

Interviews – N = 17 – All 3 Rubinius core developers, 12 peripheral developers, and 1 Senior VP
Public Audio – N = 3 – Podcasts and other audio interviews conducted with core developers
Public Video – N = 3 – Videotaped conference talks conducted by core developers
Public Text – N = 8 – Blog posts and interviews conducted with core developers
Public IRC – N = 1000+ – Archived IRC conversations between core developers, developers, and users
Public Email – N = 100+ – Mailing list conversations between core developers, developers, and users
Routines (Activities) – N = 686 (3,704) – Distinct pull requests on Github

Computational analysis

Computational analysis of activity sequences – called sequence analysis – permits us to analyze the entropy and heterogeneity of activity patterns (Gabadinho, Ritschard, & Studer,

2011). These methods have emerged in sociology mainly to investigate life courses and careers (Abbott & Hrycak, 1990) or spatial movement of actors (Wilson, 2001). Results of sequence analysis can be further organized into compositionally similar clusters of activities (e.g. Sabherwal & Robey, 1993). The clustering of sequences and the measurement of heterogeneity are derived from the calculated distances between sequences. In sequence analysis, this distance represents the degree of dissimilarity between two sequences as calculated using Optimal Matching (OM) methods. Consider the following two pull request sequences (1 and 2) of Github digital trace data:

1: open / comment / merge

2: open / comment / close


In order to measure the similarity in these activity patterns – the extent to which sequences align, or are similar – we need to estimate the effort required to transform one of the sequences into the other. In this example, we have to replace the ‘merge’ activity in sequence 1 with a ‘close’ activity to arrive at sequence 2. Assuming that the cost for a single conversion is set to 2, the total cost or distance between the sequences is 2 (Abbott

& Hrycak, 1990). Converting sequences with more differences results in higher distances, which means that the sequences are more dissimilar while lower distances indicate more similarity. The total distance between two sequences is called the OM distance. The OM distances between every pair of sequences in a given set constitute a distance matrix. By exploring the distance matrix we can observe that a set of sequences with only small distances implies low heterogeneity. We are more likely to observe this when sequences are short or when activity sequences follow highly common patterns. In another set of sequences we may observe larger distances, implying greater heterogeneity. We are more likely to observe this when sequences are long, and exhibit unique patterns in the ordering of activities.
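To make the distance computation concrete, the following is a minimal Python sketch of Optimal Matching as a dynamic-programming edit distance. The constant costs (substitution = 2, matching the worked example above; insertion/deletion = 1) are illustrative assumptions, not necessarily the exact settings used in the study.

```python
# Minimal sketch of Optimal Matching (OM) distance under constant costs:
# substitution = 2 (as in the worked example), insertion/deletion = 1 (assumed).
def om_distance(s1, s2, sub_cost=2, indel_cost=1):
    """Edit distance between two activity sequences via dynamic programming."""
    n, m = len(s1), len(s2)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = d[i - 1][j - 1] + (0 if s1[i - 1] == s2[j - 1] else sub_cost)
            d[i][j] = min(sub, d[i - 1][j] + indel_cost, d[i][j - 1] + indel_cost)
    return d[n][m]

# Reproduces the worked example: one substitution ('merge' -> 'close') at cost 2.
assert om_distance(["open", "comment", "merge"],
                   ["open", "comment", "close"]) == 2
```

Computing this distance for every pair of sequences fills the distance matrix that the clustering step below takes as input.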

Clustering algorithms use the distance matrix as an input to derive clusters of sequences

(in my case, pull requests) based on their OM distances. The emergent clusters can then be considered routines – families of similarly patterned activity sequences. The clustering algorithm does not by itself specify the number of clusters, but provides goodness of fit statistics for evaluating multiple clustering solutions. In general, these fit statistics evaluate how well each clustering solution reproduces the original distance matrix as well as how convergent and discriminant the clusters are (Studer, 2013). As suggested by

Studer (2013) I used several standard statistics to determine a suitable number of clusters:


Point Biserial Correlations (PBC), Hubert’s Gamma (HG), and Average Silhouette Width

(ASW). PBC and HG measure the capacity of the clustering to reproduce the distance matrix, whereas ASW measures the degree to which distances between routines are small within clusters and large between clusters.
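The original evaluation was presumably run with R packages such as TraMineR/WeightedCluster; as a hedged Python analogue, the sketch below performs hierarchical clustering on a precomputed OM distance matrix and reports Average Silhouette Width for a range of cluster counts (PBC and Hubert’s Gamma could be derived from the same matrix).

```python
# Hedged Python analogue of the cluster-evaluation step: hierarchical
# clustering over a precomputed OM distance matrix, scored by ASW per k.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import silhouette_score

def evaluate_cluster_solutions(dist_matrix, k_range=range(2, 10)):
    """dist_matrix: square numpy array of pairwise OM distances."""
    condensed = squareform(dist_matrix)      # full square -> condensed form
    tree = linkage(condensed, method="ward")
    scores = {}
    for k in k_range:
        labels = fcluster(tree, t=k, criterion="maxclust")
        scores[k] = silhouette_score(dist_matrix, labels, metric="precomputed")
    return scores  # inspect for the 'elbow' / maximum ASW
```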

Descriptive statistics and metrics

After extracting clusters, I moved on to extract descriptive statistics for each of the clusters. These statistics were extracted using a combination of the already analyzed sequence data, query scripts, and inferences drawn from the coding of the qualitative data. To measure routine variety I calculated the heterogeneity and entropy of each cluster (further described below). To further measure the extent and number of activities related to each pull request, I calculated their average duration, typical number of participants, and average number of related activities. To capture the successful coordination of problem solving activities, I calculated the percentage of pull requests that were eventually merged into the codebase. To assess the complexity of code I calculated the average net Lines of Code (LOC) attached to each pull request, the average number of commits (distinct code patches) attached to each pull request, and the average number of files modified by each pull request.

The degree of heterogeneity of each cluster is calculated as the average of all the OM distances within that cluster, normalized by the length of the sequences. This indicates the overall heterogeneity in the sequential ordering of activities. I also measure the entropy of activity distributions in each routine using Shannon’s concept of entropy

(Shannon, 1948) defined as the ‘uncertainty’ of predicting the event distribution in a given time period (Gabadinho, Ritschard, & Studer, 2011). The entropy is calculated by


summing, for each activity type, the proportion of its occurrences multiplied by the natural logarithm of that proportion, and negating the resulting sum. The highest entropy is achieved when a sequence contains only a single instance of every activity type. Any duplication of activity types in the sequence decreases entropy. The measure suggests that highly entropic sequences are efficient – they contain a minimum number of activities needed to reach a certain outcome. Last, to counter the claim that the difference between entropy and heterogeneity simply boils down to differences in length, each of these measures was correlated with the length of routine enactments, revealing relatively minor correlations (heterogeneity correlated with length at 0.05, entropy correlated with length at 0.43, while heterogeneity and entropy correlated with each other at 0.16).
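As I read these definitions, the two measures can be sketched as follows; this is a hedged Python illustration whose exact normalizations may differ from the study’s. It reuses the om_distance function from the earlier sketch, and the entropy normalization assumes the nine Github activity types listed in Table 9.

```python
# Hedged sketches of the two routine-variety measures; exact normalizations
# in the study may differ. Reuses om_distance() from the earlier sketch.
import math
from collections import Counter
from itertools import combinations

def entropy(seq, alphabet_size=9):
    """Shannon entropy of a sequence's activity-type distribution,
    normalized by log of the alphabet size (9 Github activity types)."""
    counts = Counter(seq)
    n = len(seq)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(alphabet_size)

def heterogeneity(cluster_sequences):
    """Mean OM distance over all pairs in a cluster, normalized by sequence
    length (here, assumed to be the longer sequence in each pair)."""
    pairs = list(combinations(cluster_sequences, 2))
    if not pairs:
        return 0.0
    return sum(om_distance(a, b) / max(len(a), len(b)) for a, b in pairs) / len(pairs)
```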

Data Coding

Generating a Coding Structure

After routine clusters had been identified using computational methods, I complemented the ‘thin’, quantitative information on routine clusters with their ‘thick’ descriptions

(Geertz, 1973) and related analysis in terms of their information processing functions.

This was done in three steps: qualitative analysis of interviews and archival data, structured coding of routines, and “zooming in and out” (Gaskin et al., 2014) across routines and their context. First, I analyzed interviews and archival data using an open and axial coding approach informed by the grounded theory tradition (Glaser & Strauss,

1967), which helped us to elicit descriptive codes with regard to information processing tasks such as analyzing tasks, writing code, or collating distinct pieces of information to solve specific tasks. Open coding yielded categories depicting types of information processing activities and their structuring in routines. The process yielded 432 text excerpts along with 13 analytical memos and 87 distinct low-level codes. Second,


a more general coding scheme (see Table 11 below) was constructed through axial grouping of these low-level codes into categories, and then used to code sequence-level archival data. The coding scheme consisted of activity categories describing the types of things that developers do during the course of writing software, such as ‘reporting problems’, ‘assigning blame’,

‘probing code’, and ‘rejecting code’.

Table 11. Coding Scheme

Reporting problems – Alerting the community to the presence of a perceived problem (e.g. a bug or a feature request) using the official Github issue tracker
Writing code – Attaching code to a pull request so as to suggest a specific solution to a specific problem (e.g. fixing a bug)
Rewriting code – Rewriting code attached to a pull request based on specific demands being placed on the contribution by other (often core) developers (e.g. updating code to make it more clear)
Probing code – Asking developers for additional information or conducting tests to elicit additional information (e.g. asking which versions of relevant software are being used)
Transferring information across sequences – Establishing references across different activity sequences so as to connect distinct pieces of information (e.g. connecting two related problems)
Assigning blame – Identifying the source of a problem as stemming from user error or external artifacts (e.g. indicating how the wrong parameters have been specified, thus generating a crash)
Rejecting code – Making a decision to reject suggested code written to solve a problem (e.g. rejecting code changes because they cause other problems)
Merging code – Making a decision to accept suggested code written to solve a problem (e.g. accepting code changes improving performance of a specific function)

Coding Activity Sequences

The coding scheme was subsequently applied to a random sample of 25% (172 out of 686 sequences) of the activity sequences in each cluster, so as to characterize the overall information processing functions and supporting repertoires of activities. This is intended to yield insights into what work each type of routine performs, and how it does it. As the activity sequences in each cluster had already been determined to be structurally similar to each other by the clustering algorithm, it was fairly straightforward to identify the commonalities in information processing across the analyzed activity


sequences. I also identified repertoires of activities within each cluster, which collectively performed the information processing function each routine cluster serves. The identification of different repertoires of activities expanded the granularity of the coding scheme, as more detailed types of activities within each cluster could be identified. For example, a category like “rewriting code” was expanded into “adding features”,

“increasing code clarity”, and “increasing performance”.

Zooming in and out across regressions and qualitative data analysis to generate theory

Next, I used both logit and GLM regressions to assess the effects of routine variety

(entropy and heterogeneity) on the probability of code getting merged (signaling successful coordination of a development task) and the duration of pull requests

(signaling the efficiency of the coordination process). The regressions were conducted for the full dataset, as well as for subsets derived during the empirical study. I controlled for the effects of the defined code metrics. To capture the explanatory power of the logit models I used the Nagelkerke pseudo-R2. Both entropy and heterogeneity were standardized, meaning that each coefficient should be interpreted in terms of a one-standard-deviation increase. Subsequently, I sought answers to the question of how developers utilize routines to tackle more complex tasks. Therefore, the emerging theory is essentially an explanation of how routine structures come to adopt certain characteristics as a response to tackling tasks of varying complexity. This seeks to explain how the structure of activities involving higher degrees of routine variety is associated with more complex code (in terms of net LOC, number of commits, and number of files touched).
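A hedged Python sketch of this regression step follows (the study itself likely used R). The data-frame column names are illustrative, and because Nagelkerke’s pseudo-R2 is not built into statsmodels, it is computed by hand from the fitted and null log-likelihoods.

```python
# Illustrative sketch: logit of merge probability on standardized routine-variety
# measures plus code-metric controls, with a hand-computed Nagelkerke pseudo-R2.
# Column names ("merged", "entropy", "heterogeneity", ...) are hypothetical.
import numpy as np
import statsmodels.api as sm

def fit_merge_model(df):
    """df: pandas DataFrame with one row per pull request; 'merged' is 0/1."""
    X = df[["entropy", "heterogeneity", "net_loc", "n_commits", "n_files"]].copy()
    # Standardize the two routine-variety measures so coefficients read as the
    # effect of a one-standard-deviation increase.
    for col in ("entropy", "heterogeneity"):
        X[col] = (X[col] - X[col].mean()) / X[col].std()
    model = sm.Logit(df["merged"], sm.add_constant(X)).fit()
    n = len(df)
    cox_snell = 1 - np.exp(2 * (model.llnull - model.llf) / n)
    nagelkerke = cox_snell / (1 - np.exp(2 * model.llnull / n))
    return model, nagelkerke
```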

The theory itself emerged from the zooming in and out across different components of the evidence available. As such it was iteratively refined so as to achieve consistency


across quantitative and qualitative data. The resultant theory therefore represents a set of abductive (Locke et al., 2008) statements, which make sense of the patterns in the data.

These statements are both process- and variance-oriented, and aim to explain how certain variables correlate due to specific generative processes.

Findings

I organize the findings based on the three stated research questions. Per RQ1 I identified four clusters of activity sequences labeled as: 1) triaging problems; 2) transferring information; 3) discourse-driven problem solving; and 4) direct problem solving. To address RQ2 I analyzed two of these four, “discourse-driven problem solving” and

“direct problem solving”, which both resulted in substantial amounts of merged code. To address RQ3, I investigated these two classes more deeply so as to identify which had greater variety and which dealt with more complex tasks. To do so I investigated how the structure of the routines (in terms of their variety) impacts the likelihood of successful development tasks (i.e. merging the code) and the efficiency of the routine in terms of the temporal duration of the activity sequence. This inductive, exploratory process is summarized in Figure 6 and findings are detailed below.

Figure 6. Structure of Findings (figure: 1. identify four clusters computationally; 2. explore the four clusters qualitatively by coding activities; 3. explore the relationship of routine variety to problem solving success and efficiency; jointly leading to theory generation – a process model of OSS coordination)


Routine Variety: Four Clusters of Routines

An evaluation of the possible cluster solutions is shown in Figure 7. Here we can see a distinct ‘elbow’ at four clusters; beyond that point, the added explanatory power of using larger numbers of clusters, as evaluated by the fit statistics PBC, HG, and ASW, starts to flatten out. Table 12 identifies the main information processing function of each cluster based on themes which emerged from the qualitative coding, as well as their key quantitative properties.

Figure 7. Evaluating Cluster Solutions

Each identified activity sequence in Figure 8 is plotted as a single point in a Euclidean space where the distance between two dots represents the OM-distance between two activity sequences. It is important to note that because it is a Euclidean space, the X- and

Y-axes do not have direct interpretations. Rather, the graph should be seen as a type of coordinate system, which allows the reader to visually estimate distances between observations (i.e. distinct routine enactments). Hence, the overall spread of dots within a single cluster represents the overall detected heterogeneity within that cluster. For example, we can see that the routine cluster with the lowest heterogeneity, triaging problems, is represented by essentially overlapping dots – illustrating that the activity


sequences within this cluster are very similar to each other. In contrast, discourse-driven problem solving is visualized as dots spread over a larger area, indicating that the overall heterogeneity is much higher. The overall positioning of clusters also tells us about the similarities across clusters. We can see that the “triaging problems” cluster is represented by a tight grouping of dots, far removed from the other clusters. This indicates that it has low internal heterogeneity and is structurally far removed from the other clusters.

Figure 8. Visualization of Routine Clusters

Further, to validate the notion that the identified patterns are routines, i.e. that they are persistent across time and do not just emerge as statistical patterns in random data, I cut the dataset into two time periods. This was done through calculating the temporal midpoint of each routine enactment, and then cutting the dataset into two (with 686/2 routine enactments in each half) – an earlier and a later period. A cluster evaluation similar to the one conducted above was executed, and the fit statistics indicated that two clusters was the optimal solution in the early time period, whereas three clusters was the optimal solution in the latter time period. Through examining activity frequencies of each


of the clusters I determined that the two main clusters in each time period corresponded to direct and discourse-driven problem solving. For example, the ‘commented’ activity type dominated the discourse-driven problem solving clusters in each time period at

58.73% and 59.24% respectively. Similarly, the direct problem-solving clusters featured the ‘merged’ activity at 18.67% and 22.16% respectively. Further, the third cluster in the second time period corresponded to the transferring information cluster (38.51%

‘referenced’ activities). These results validate the notion that the two routine clusters that will be shown to be central to my argument – direct and discourse-driven problem solving – are persistent across time.
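The temporal-split check can be sketched as follows: a minimal Python illustration under the assumption that activity timestamps are available as epoch-second numbers; each half would then be re-run through the same OM and cluster-evaluation steps.

```python
# Minimal sketch of the persistence check: split routine enactments at the
# median of their temporal midpoints, then re-cluster each half separately.
# Assumes `timestamps` maps each PID to a list of epoch-second activity times.
import numpy as np

def split_by_temporal_midpoint(sequences, timestamps):
    mids = {pid: (min(ts) + max(ts)) / 2 for pid, ts in timestamps.items()}
    cutoff = np.median(list(mids.values()))
    early = {pid: s for pid, s in sequences.items() if mids[pid] <= cutoff}
    late = {pid: s for pid, s in sequences.items() if mids[pid] > cutoff}
    return early, late  # each half gets the same OM + cluster-evaluation treatment
```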

The varying degrees of heterogeneity and entropy indicate that each cluster contains structurally different patterns of activity sequences. The two routine clusters with the lowest routine variety – triaging tasks and transferring information – also merge the lowest fractions of code (1.0% and 1.5% respectively). Hence, these are less complex and mostly standardized routines, which support, rather than constitute, the activities geared towards actually solving problems. The two routine clusters with higher degrees of routine variety (direct problem solving and discourse-driven problem solving) merge code more often (86.9% and 15.8% respectively). Hence, these are more complex and varied routines geared towards solving problems of varying degrees of complexity.

Further, using a non-parametric ANOVA – the Kruskal-Wallis rank sum test – I established that the two routines which merge the majority of code are significantly different across entropy, heterogeneity, and all three code metrics.
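The Kruskal-Wallis test itself is a one-liner in SciPy; the sketch below compares the two code-merging clusters on each measure (data-frame and column names are illustrative, not the study’s actual ones).

```python
# Illustrative Kruskal-Wallis comparison of the two code-merging clusters
# on each routine-variety and code-complexity measure.
from scipy.stats import kruskal

MEASURES = ("entropy", "heterogeneity", "net_loc", "n_commits", "n_files")

def compare_clusters(direct_df, discourse_df):
    """Return {measure: (H statistic, p-value)} for the two clusters."""
    return {m: kruskal(direct_df[m], discourse_df[m]) for m in MEASURES}
```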


Table 12. Summary of Cluster Characteristics

(Each row reports values in the order: Triaging Tasks* | Transferring Information* | Discourse-driven problem solving* | Direct problem solving*. Standard deviations in parentheses.)

Code metrics
Average net LOC attached to pull requests: 1.29 (49.31) | 64.25 (83.11) | 96.27 (458.58) | 20.35 (391.22)
Average commits attached to pull requests: 2.17 (2.40) | 6.60 (7.54) | 4.0 (12.81) | 2.05 (1.81)
Average files touched by pull requests: 2.17 (2.86) | 5.00 (3.67) | 8.69 (36.29) | 3.78 (5.34)
% Merged: 1.0% | 1.5% | 15.8% | 86.9%

Routine characteristics
Entropy: 0.07 (0.04) | 0.12 (0.08) | 0.45 (0.19) | 0.67 (0.16)
Heterogeneity: 0.04 (0.04) | 0.11 (0.09) | 0.23 (0.20) | 0.14 (0.09)
Typical # of participants: 1-2 | 1-2 | 2-4 | 1-2
Average duration in days: 52.80 (106.81) | 75.08 (147.73) | 74.62 (154.51) | 14.56 (57.41)
Average number of activities: 2.22 (1.47) | 4.11 (3.29) | 8.52 (7.04) | 6.94 (3.19)
N (% of total): 97/686 (14.14%) | 63/686 (9.18%) | 366/686 (53.35%) | 160/686 (23.32%)

* The labels of the four routine types were assigned based on the qualitative inquiry performed in the next section.

Over half of the activity sequences (53.35%) fall within the discourse-driven problem solving routine, which at a heterogeneity of 0.23 has the most varied ordering of activities. It has a high degree of entropy (0.45), but not as high as direct problem solving

(0.67), indicating that in discourse-driven problem solving activities are repeated multiple times in iterative, redundant patterns to a greater degree than in direct problem solving.

Further, it also has the highest typical number of participants (2-4, whereas all other routines involve 1-2), the highest net LOC, on average touches the highest number of files compared to other routines, and also has a larger number of commits attached to


each pull request compared to direct problem solving. Last, its average duration is much longer than direct problem solving (74.62 compared to 14.56 days). This indicates that a large fraction of routine performances are temporally extensive, have varied ordering, involve multi-actor collaboration, and often lead to more complex code being written.

Information Processing Functions of Routine Clusters

Triaging Tasks

The Rubinius project continuously receives a stream of bug reports and requests for various features. These are expressions of users’ experiences with the software. Hence, submitted bug reports provide the overall set of tasks that developers may work upon. At the same time, choices must be made with regard to which tasks to focus on. Making these choices is done by leveraging a routine focused on reporting and triaging tasks, which serves as a sorting function where certain tasks are flagged as important, whereas others are rejected or ignored. Often the performance of this routine leads to submitted code being outright rejected, but in many cases it leads to a specific task being selected for further processing. Performances of the triage routine make up 14.14% of all routines and have the lowest entropy (0.07) and heterogeneity (0.04) of all clusters; only 1.0% of these performances lead to the merging of pull requests into the codebase. The low rate of merging indicates that this routine mainly serves an information search function. The low heterogeneity indicates that the structures provided by this routine are strict and highly economical. Further, they are often carried out by a single individual or a dyad. Below is an example of a typical triaging routine enactment:


strotnik4 opened this pull request: I have installed rubinius using rvm. Trying to require gems result in error…
cataldo commented: In Ruby 1.8 rubygems isn't loaded by default. You need to require 'rubygems' before trying to load a gem.
cataldo closed this pull request

As we can see, the activity sequence is short and has low routine variety – a developer is having difficulties loading external packages (i.e. requiring ‘gems’, which is what the

Ruby programming language calls its packages), a solution is pointed out, and the activity sequence is closed. In summary, this routine represents the ways in which the community perceives and evaluates tasks. These tasks are often closed by this process, because the task is judged to be irrelevant. At times, tasks are forwarded to a different routine, such as discourse-driven problem solving or direct problem solving, for further processing.

The routine itself consists of a number of different types of activities, which developers combine in multiple ways. These are: reporting, solving externally, and rejecting. All of these activities serve specific information processing functions. Reporting displays specific bits of information to the community so that evaluation can occur. The latter two activities, solving externally and rejecting, are both final decision points where either sufficient routine variety has been provided by a different activity sequence (solving externally), or enough information has been provided to reject the task. Table 13 describes these activities.

4 All developer names are fictitious, to preserve the anonymity of the people being portrayed. However, I have attempted to choose nicknames similar in character to what is commonly used on Github, to preserve a sense of context.


Table 13. Repertoire of activities – Triaging Tasks

Reporting – Initial reporting of a bug or feature request – “When running a script which creates threads, the agent does not seem to be able to get a stacktrace.“
Solving externally – Shows that the problem has been solved externally (in a different activity sequence) – “merb closed this issue from a commit: Don’t load other Hash when enabling HAMT. Fixes #1926”
Rejecting – Closing a pull request by saying that the issue is not relevant – “Oops! Created a new pull request + issue combo. Closing!”

Transferring Information

Traces of past performances of routines, archived on Github, carry information with regard to previous assessments. These are often helpful in making current decisions. The routine used to explicitly transfer such information across routines makes up over 9% of the total number of routines. This routine utilizes a large number of ‘reference’ activities, which means that information stored in the ‘memory’ of one activity sequence is transferred, or ‘grafted’, onto another activity sequence where it can be usefully applied

(often a discourse-driven problem solving or direct problem solving routine). Hence, this routine channels information from past performances to aid current performances, thus helping developers generate a more complete picture of the faced task. This routine has low entropy (0.12) and heterogeneity (0.11) relative to other routines, indicating that the behavioral templates it provides are relatively strict and simple. This routine only leads to the merging of code in 1.5% of cases, indicating that it mainly serves a support function for locating information about past solutions – which may go far back in time, as indicated by the average duration of 75.08 days – or about ongoing solution searches. As such, the transferring information routine plays a small but crucial mediating role in the overall

96

system of routines as it reduces the need for duplicate information processing. Below is an example of a typical information transfer routine enactment:

rohit opened this pull request: Improving README presentation
rohit commented: Rubinus presentation in Github should be good looking! I just made some presentation changes in the README and added the..md extension to make it looks nice...hope you guys enjoy it.
xzit commented: Oh, we've had that here before [links to previous attempts to change the readme file in a similar manner]
rohit commented: =[ sorry. next time I'll look and the pull request history before pulling anything. i'll close it. and thanks!
rohit closed this pull request
This pull request was referenced
This pull request was referenced

This example shows us how developers use past knowledge stored in the ‘memory’ of one routine, and then transfer that knowledge for use in the performance of other routines (i.e., facilitated by the ‘referenced’ activities). We can directly observe how certain activity sequences within this routine carry knowledge that can be used as input to other activity sequences. This shifts the complexity of a certain task to an alternative routine, where the task can be solved more easily.

The information transferring routine typically consists of a number of different activities, which developers combine in multiple ways. These are: reporting, transfer to direct problem solving, and transfer to discourse-driven problem solving. Reporting is the same activity as in the previous routine; the two transfer activities, however, are ways in which information is moved across routines – tying together multiple pieces of information to increase the capacity of developers to solve a task. Table 14 describes these activities.


Table 14. Repertoire of activities – Transferring Information

Reporting
  Description: Initial reporting of a bug or feature request
  Quote: “When passing a callback to an ffi-bound function and this callback is not the last parameter, rubinius crash[e]s when passing the callback, even if the callback itself is never called.”

Transfer to direct problem solving
  Description: Referring to the direct problem solving routine which solves the problem
  Quote: In the referenced activity sequence: “john referenced this issue”; in the referring activity sequence: “john: Fixes #2082” “carl Thanks for your detail report :)”

Transfer to discourse-driven problem solving
  Description: Referring to the discourse-driven problem solving routine which investigates the problem
  Quote: In the referenced activity sequence: “waldo referenced this issue”; in the referring activity sequence: “Fixes #2011. Obviously, this is not finished. I'm new to Rubinius and I'm starting to use it in development. I read the contributing guide and I know that I have to add specs for this (I'll investigate how to do it). If someone has a better fix, please send it! Anyway, I just want it to give it a try :)”

Discourse-driven problem solving

Often the complexity involved in resolving a specific task is not known beforehand. In these situations, developers must inquire in detail into the software artifact in order to assess the scope and severity of the task and evaluate possible solutions. Often this is related to the presence of complex code showing unexpected and ambiguous behaviors due to unaccounted interdependencies:

“So like how your kernel is scheduling your threads, how your, I don’t know, how your memory behaves in your system, whether certain code or certain objects are in your CPU cache or not and all that kind of stuff comes into play and it’s way too complex. It cannot be consistent over each run, so maybe if you run it one time, it crashes in a certain place in your code. If you run it a second time, it crashes in another place in your code, so it’s something you have to like deal with and try to make sense of.”

In such highly interdependent code, a large number of software components become interrelated in ambiguous ways. In these situations it is difficult to encapsulate a task in a specific, modularized way. Rather, the code remains fragmented and hides underlying interdependencies, creating a problem space where parts of the code are highly coupled with other parts. Therefore, the degree of task complexity cannot be assessed a priori. Rather, it is uncovered and constructed as developers start to perform the task through various probes, such as fixing a particular bug or implementing a single specific feature. As developers leverage a range of simple solutions, “quick fixes”, interdependencies become more visible in the form of error messages and new bugs. This suggests that performing the task is likely to be non-trivial, involving a larger variety of activities, often ordered in unpredictable and heterogeneous ways. Therefore this routine exhibits the highest heterogeneity (0.23) and the second highest entropy (0.45). The high degree of heterogeneity is generated by the heterogeneous ordering of problem-solving activities, as illustrated in the sequence below. The entropy is somewhat lower, compared to direct problem solving, because activities are iterated multiple times across the problem-solving process.
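To make these two measures concrete, the sketch below computes a normalized Shannon entropy over the activity types within a single sequence, in the spirit of standard sequence analysis tooling; the function and its normalization are illustrative only, not the exact operationalization behind the figures reported here. Heterogeneity, by contrast, is a between-sequence property derived from pairwise Optimal Matching distances over activity orderings (see the sketch in Study #2's method section).

    import math
    from collections import Counter

    def sequence_entropy(activities):
        # Normalized Shannon entropy of the activity types in one sequence:
        # 0 when a single activity type dominates completely, close to 1 when
        # all observed activity types occur equally often. Illustrative only.
        counts = Counter(activities)
        n = len(activities)
        if n == 0 or len(counts) == 1:
            return 0.0
        h = -sum((c / n) * math.log(c / n) for c in counts.values())
        return h / math.log(len(counts))  # divide by the maximum possible entropy

    # A terse, efficient sequence scores high; a repetitive one scores lower.
    print(sequence_entropy(["open", "comment", "merge"]))                         # 1.0
    print(sequence_entropy(["open", "comment", "comment", "comment", "close"]))  # ~0.87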

Discourse-driven problem solving comprises over half (53.35%) of all the activity sequences, making it the most common routine. This shows its prevalence in moving the development process forward and its criticality in managing complex tasks.

In itself, this routine does not lead to high degrees of code merging. Only 15.8% of the pull requests within this routine are merged into the codebase. However, the code that is worked upon has the highest average net LOC attached to each pull request (96.27), and each pull request touches the largest number of files (8.69), compared to all other routines. Additionally, compared to direct problem solving, which accounts for the largest fraction of merged code, discourse-driven problem solving also has a larger number of commits per pull request (4.0 vs. 2.05), and stays open for a much longer duration (74.62 vs. 14.56 days). Through discourse-driven problem solving developers probe and manipulate artifacts so as to perceive hidden interdependencies within the codebase or across internal and external artifacts. This provides the foundation for formulating specific demands to put on the code, after which the code can be rewritten to conform to those demands.

The high degrees of heterogeneity of these activity sequences speak to the richness of the problem-solving heuristics that they embody. The higher number of participants also suggests that this routine integrates more diversified knowledge and skills. Below is an example of a typical discourse-driven problem solving routine enactment:

matt_johnson opened this pull request: Hash refactoring
matt_johnson commented: I could push this directly to master but since it's not a small change I decided to open the pull request to see if it's ok to do this kind of refactoring and if there's anything else I could do. All the specs pass and I personally don't see a problem on doing this since this way we get better code. What do you guys think?
loki commented on 1 commit: Is the return value of each_item the proper return value for these methods then?
matt_johnson commented: Yes, there are specs guarding this.
… (4 additional comments)
matt_johnson commented: #each_item shouldn't be exposed and if it's not exposed we can't use it in this case.
matt_johnson commented: So, any thoughts on this? Is it ok to merge? Or there's something else to change?
Shogun commented: "Shouldn't be exposed" is a value judgment, not a technical one. We try to limit adding methods that MRI doesn't have, but we break that rule whenever the situation merits it. MRI is written in C and uses invisible helper C functions all over. Rubinius attempts to write good OO code in the kernel and in this case, #each_item is a useful helper method for iterating.
matt_johnson commented: So, I updated and reverted the commit where I made #each_item private.
Shogun merged 1 commit
Shogun closed the pull request

The heterogeneous structure of this activity sequence is apparent. A developer proposes a change but asks for feedback, after which discourse ensues with regards to what is the best way of rewriting the code to address the original problem. Through multiple, iterated activities, heterogeneity is generated, the task is framed, and a solution is constructed.

The routine itself consists of a number of different types of activities, arranged in three sequential, but iterative, stages: inquiring into interdependencies, formulating demands and responses, and making decisions. In the inquiry stage a problem is first reported, and the code is then probed to test which underlying interdependencies are active, as well as how interdependencies with user behavior and external artifacts may be affecting the exhibited, problematic behavior. Inquiry, thus, occurs when developers try to understand how a certain error is generated and which components of the codebase may be involved.

Through examining error messages and logs created when various function calls are being made, developers can garner clues as to what the cause of a specific error is, and arrive at a point where it is possible to formulate specific demands to which the code must conform. The code is then rewritten by placing a number of demands on it, which need to be successfully responded to in order to create code of sufficient quality to be merged.

In the demand/response stage, many different demands and responses are common.

Providing specifications and documentation are responses to demands to make sure the code comes with appropriate ‘specs’ (tests), as well as appropriately styled commit messages that make the history of code revisions visible. Sequencing and adding features are responses to demands on the substantive functionalities of the code. Often some features need to be added only after certain other features have been implemented, and therefore need to be sequenced in a certain order. In other instances certain features seem to be missing from a code contribution, and in these cases core developers often ask the person who submitted the pull request to amend the code to include additional features. Last, code needs to be clear and to perform well, and it is therefore often subjected to demands to rewrite it to strengthen these aspects. In the last stage, decisions are made with regards to the code that has been generated. This leads either to rejecting or merging the code, or to further iterations of inquiring, demanding, and responding.

Table 15 describes these activities.

Table 15. Repertoire of activities – Discourse-driven problem solving

Inquiring into Interdependencies

Reporting
  Description: Initial reporting of a bug or feature request
  Quote: “It looks like "next" stepping in the presence of raised exceptions sometimes has problems.”

Probing
  Description: Testing various commands to see what responses they generate, so as to probe the underlying problem
  Quote: “rohit: Looks like a method visibility context isn't honored when a method is defined inside a block. Carlos: That might hint at the underlying issue, but I now wonder if it's far more significant than just affecting test/unit -- methods defined in blocks but outside classes are somehow ending up considered as public instance methods[.]”

Examining external artifacts
  Description: Examining if the problem occurs because of an external artifact
  Quote: “Rake seems angry about time extensions when loading up rake tasks for a new rails app.”

Examining user error
  Description: Examining if the problem occurs because of user error
  Quote: “As near as I can tell without looking at your system is that you have both MRI 1.8 and 1.9 installed and when you installed rake, you did so with 1.9, so the rake gem binary wrapper is invoking 1.9. This is a long-standing issue with the gems dumping bin wrappers into system directories. In Rubinius, gem binary wrappers go in a separate directory. RVM also handles this. I recommend only installing either MRI 1.8 or 1.9 via apt, but not both. Use RVM for multiple Ruby versions.”

Formulating Demands & Responses

Providing specifications
  Description: Adding specifications (tests) to make sure code conforms to requirements
  Quote: “phoenix: Can you please add some specs? … tertiary: Okay, thanks, I'll see what I can do.”

Providing documentation
  Description: Improving clarity of commit messages and history to ensure historical changes can be tracked
  Quote: “IO_composer: Can you modify it in such a way that the original specs aren't added and then removed? Right now there is a commit that modifies files in both spec/ruby and another directory, which makes merging to rubyspec harder. We also like to keep our history clean, so that's another reason to have a nice and clean pull request. … john_B: updated”

Sequencing features
  Description: Adding features in an appropriate order
  Quote: “Please hold off on more FFI [external library] PRs. I'm going to work on getting the ffi gem to run on Rubinius.”

Adding features
  Description: Rewriting code to add missing features
  Quote: “eric: Did you also verify the behavior in 1.8 mode? TMZ: I didn't enable 1.8 mode. Now, works fine. Thanks”

Increasing code clarity
  Description: Rewriting code to increase clarity
  Quote: “Can you add an appropriate fails tag…? Then I can go ahead and merge it in.”

Increasing performance
  Description: Rewriting code to increase performance
  Quote: “lockfox: Iterating directly is a performance improvement, we don't want to use each_item here. yoyo: Ok. I changed this.”

Making Decisions

Merging
  Description: Merging attached code into the codebase
  Quote: “This pull request passes. Merging” – “woroo: Thanks!”

Rejecting
  Description: Rejecting code as not suitable to be merged into the codebase
  Quote: “I think your solution is better than me. :) Thank you! Now, I can happily hack rubinius with default version as 1.9! Hehe.”

Direct problem solving

Tasks that can be delineated to focus on specific, limited parts of the code are often well-defined and therefore simple. These tasks do not have strong interdependencies with other parts of the code. Such modularization leads to simpler problem spaces:


“So when you’re working on a complex behavior like this, what value do you compute for external encoding? If one big block of tests that tell it to test multiple things fails, you have to sort it out by hand, which is time-consuming and confusing. If you can make it very fine grained like this, then you can attack a very complex behavior one little facet at a time and you just get one passing, two passing, three passing and four passing. As you get each additional one passing, you still know that you haven’t broken anything you’ve already implemented.”

The activity sequences that conduct the work of searching for and implementing solutions to already simplified problems represent roughly 23% of all the routines. This makes direct problem solving the second most common routine type. Here simple problems are addressed one by one, and code is written to solve each problem and then merged into the codebase. This means that individual developers work to process the uncertainty pertaining to understanding the task and writing the code to provide a solution. This routine exhibits the second highest heterogeneity relative to other routines (0.14), but the highest degree of entropy (0.67), and leads to the merging of code quickly (14.56 days on average) and at a higher rate than any other routine (86.9%). However, indicators of code complexity such as average net LOC added to each pull request (20.35), average commits attached to each pull request (2.05), and average number of files touched by each pull request (3.78) are all lower than in discourse-driven problem solving tasks. This indicates that the routine is used to solve relatively simple problems, using highly entropic sequences. These are efficient sequences, which contain the minimum number of activity types necessary to solve a certain problem, and therefore appear as highly entropic. Below I illustrate a typical direct problem solving routine enactment:


skogstroll opened this pull request: Add DBL2NUM macro for capi
carl_c commented: This macro is standard in MRI and I had a gem that wants to use it.
carl_c merged 1 commit
carl_c closed this pull request

Here a developer is adding a macro that is necessary for a package (‘gem’) that s/he wants to use. The submitted code is accepted without discussion and merged by a core developer. The routine itself consists of a number of different types of activities, which developers combine in various ways. These are: reporting, writing code, merging, and rejecting. While none of these activities in themselves are unique vis-à-vis the previously described routines, the routine as a whole is unique in the sense that it a) selects simple enough problems, and b) musters necessary amounts of routine variety to match the complexity of the task and thereby creates a successful solution. In Table 16 I describe each of these activities.

Table 16. Repertoire of Activities – Direct problem solving

Reporting
  Description: Initial reporting of a bug or feature request
  Quote: “When the computer isn't connected to any network, some specs fail only on Rubinius. They don't on MRI.”

Writing code
  Description: Attaching code written to solve a particular issue
  Quote: “A RubySpec for the DateTime class had a call to should in the wrong place, which is fixed by this pull request.”

Merging
  Description: Merging attached code into the codebase
  Quote: “Good catch, thanks! c_gustafsson merged [the] commit”

Rejecting
  Description: Rejecting code as unsuitable to be merged into the codebase
  Quote: top_coder: “We've reworked these parts now and haven't seen any of these crashes anymore. Closing this issue, if it's still happening, please let us know!”

Routine Variety and Development Task Outcomes

In Table 17 below we can see that, for the entire dataset (all four routine clusters), entropy predicts merging of code, whereas all other predictors have non-significant effects (controlling for Lines Of Code, ‘LOC’, number of commits, and files affected). An increase of one standard deviation in entropy increases the chance of code being merged by 98.44%. This indicates that varied activity types (entropy) are important for driving the process of merging code. To understand the different factors that drive the merging of code in the two routines which do most of the code merging – discourse-driven problem solving and direct problem solving – I also perform logit regressions using each of these partial datasets.

Table 17. Logit Regressions of Code Merging

                 All routines              Direct problem solving    Discourse-driven problem
                 (AIC=324.91, R2=.75,      (AIC=59.64, R2=.71,       solving (AIC=142.08,
                 N=686)                    N=160)                    R2=.69, N=366)

Coefficient      Estimate  S.E.   p        Estimate  S.E.   p        Estimate  S.E.   p
Intercept        -2.81     0.29   <0.001   0.99      0.63   0.11     -4.44     0.56   <0.001
Entropy          4.15      0.36   <0.001   2.74      0.59   <0.001   3.56      0.50   <0.001
Heterogeneity    -0.19     0.15   0.51     -0.42     0.52   0.41     0.40      0.17   0.02
Controls
Net LOC          0.00      0.00   0.38     0.00      0.00   0.99     0.00      0.00   0.22
Commits          0.01      0.16   0.95     1.74      0.50   <0.001   0.17      0.09   0.05
Files            -0.16     0.11   0.16     -0.09     0.05   0.08     -0.07     0.04   0.07

Comparing these two models (two rightmost columns of Table 17) we can see that in the direct problem solving cluster, entropy predicts merging, but heterogeneity does not. The entropy coefficient tells us that an increase of one standard deviation increases the chance of the pull request getting merged by 93.93%. This makes sense, as this is essentially a form of superposition where simple problems are solved by simple sequences of action – essentially the minimum number of activities necessary to solve a certain problem. For discourse-driven problem solving, we can see that both entropy and heterogeneity are significant predictors. An increase of one standard deviation in entropy increases the chance of the pull request getting merged by 97.23%, and an increase of one standard deviation in heterogeneity increases the chance of the pull request getting merged by 59.87%. Hence, while entropy is sufficient for direct problem solving, to deal with more complex tasks developers also need to add heterogeneity into routine activities. When dealing with complex tasks, developers need to muster the requisite variety (both in terms of activity types and their ordering) necessary to match the complexity of the tasks. Hence, OSS developers deal with a range of complexity in the tasks they encounter, and do so in different ways.
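As an illustration of the form of model behind Table 17, the sketch below fits a logit of merging on standardized predictors with the three controls. The data frame, file name, and column names are hypothetical stand-ins (the actual estimation scripts are not reproduced here), so this is a minimal sketch rather than the analysis itself.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical frame: one row per pull request, with its routine cluster,
    # routine-variety measures, controls, and a 0/1 merged outcome.
    df = pd.read_csv("pull_requests.csv")

    # Standardize the variety measures so coefficients read as per-SD effects.
    for col in ["entropy", "heterogeneity"]:
        df[col] = (df[col] - df[col].mean()) / df[col].std()

    formula = "merged ~ entropy + heterogeneity + net_loc + commits + files"
    print(smf.logit(formula, data=df).fit().summary())  # all routines

    # Refit on the two code-merging clusters separately (cf. Table 17).
    for cluster in ["direct", "discourse"]:
        sub = df[df["cluster"] == cluster]
        print(smf.logit(formula, data=sub).fit().summary())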

Table 18. Regressions of Duration

                 All routines            Direct problem solving   Discourse-driven problem
                 (R2=0.07, N=686)        (R2=0.23, N=161)         solving (R2=0.11, N=366)

Coefficient      Estimate  S.E.  p       Estimate  S.E.  p        Estimate  S.E.  p
Intercept        46.12     4.67  <0.001  15.66     6.04  0.01     70.60     7.72  <0.001
Entropy          -32.10    4.77  <0.001  -26.47    4.10  <0.001   -49.27    7.91  <0.001
Heterogeneity    15.16     4.76  <0.01   7.57      4.19  0.07     -2.20     7.85  0.78
Controls
Net LOC          0.00      0.02  0.87    0.01      0.01  0.44     0.03      0.04  0.54
Commits          1.26      2.20  0.57    -4.68     2.54  0.07     -2.61     2.97  0.10
Files            -0.58     0.80  0.47    2.28      0.87  0.01     5.07      1.24  0.09

In Table 18 we can see that for the full dataset the effects of entropy and heterogeneity on duration are inverse to each other. Higher entropy leads to shorter durations, while higher heterogeneity leads to longer durations. More precisely, an increase of one standard deviation in entropy decreases the duration by 32.10 days, while an increase of one standard deviation in heterogeneity increases the duration by 15.16 days. Entropy leads to more efficient processing of information uncertainty, while heterogeneity, being more concerned with the ambiguity of information, tends to be less efficient. When looking at the two code-merging clusters in the two rightmost columns, we can see that entropy continues to exert a strong negative influence on duration, while the effect of heterogeneity is insignificant. An increase of one standard deviation in entropy decreases the duration by 26.47 days for direct problem solving, and by 49.27 days for discourse-driven problem solving.
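The duration models in Table 18 follow the same template; assuming the same hypothetical data frame as in the logit sketch above, they would amount to an ordinary least squares fit such as smf.ols("duration_days ~ entropy + heterogeneity + net_loc + commits + files", data=df).fit(), run once on the full dataset and once per code-merging cluster.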

Theorizing: A Process Model of OSS as an Information Processing System

I found that OSS development projects contain multiple clusters of activity sequences and thus exhibit substantively different forms and degrees of routine variety, expressing distinct information processing functions, and producing code of varying degrees of complexity. This suggests that these routines also attend to varying degrees of task complexity. That is, not all performed OSS development tasks are simple.

To explain how OSS projects deal with tasks of varying complexity I asked the following questions: a) What distinct types of routines, in terms of their heterogeneity, are present within OSS projects? b) What information processing functions does each of these routines serve for dealing with development tasks of varying complexity? c) How do routines, when carried out as a joint aggregate system, enable OSS developers to tackle more complex tasks? In answering these questions I identified elements of the internal routine structure of an OSS development process, and showed how routine systems complement previously theorized coordination mechanisms such as modularization and incremental layering of code – with the caveat that such routines emerge from local interactions rather than being formally specified. Thereby I make novel observations with regards to routine variety in OSS and its relationship to successful task completion and efficiency (e.g. Capiluppi et al., 2005; Godfrey & Qiang, 2000; Robles et al., 2005), which enables us to go beyond recent accounts of coordination in OSS.

By enacting specific routines, OSS developers are directed to tackle tasks of specific nature and complexity. The routines also guide which tasks should be deferred, as well as provide means to reduce (attenuate) the complexity of faced tasks. I theorize that these routines help deal with growing complexity by evaluating, shifting, and matching complexity, thereby increasing (amplifying) the capacity to tackle the complexity associated with each development task. Accordingly, interdependencies between routines within distinct clusters illustrate how task coordination is accomplished within an OSS project when it tackles a stream of development tasks that are interlaced across performances of multiple distinct, complementary, and emergent routines. As such, these routine performances are configured on-the-fly as responses to identified task complexity.

Moreover, the complexity is primarily discovered as the tasks are worked upon. When the degree of complexity of the task is gradually uncovered, developers shift these tasks to routines with appropriate configurations of entropy and heterogeneity so that the underlying complexity of tasks can be managed and finally accommodated. Through this analysis I explain how distinct routines emerge during OSS projects, what types of routines there are, and how they help OSS developers to cope with complexity. Rooted in my analysis of the four routine classes I next combine these classes into an aggregated routine ‘system’ and propose a process model of an ‘ecology’ of OSS routines which can respond flexibly to the varying development task complexity (Figure 9).

Due to its positioning within the overall routine structure, each distinct routine plays a separate, yet complementary, function. Each stage within each routine is labeled with a number, 1 through 9. Triaging focuses on the initial part of the process by (1) reporting problems, which are then either judged to be already solved and therefore end in external problem solving (2), or to consist of (3) faulty or irrelevant code and are therefore rejected. If the problem is still unsolved, code has not been contributed, and the problem is deemed interesting and relevant, the problem is taken forward by one of the three other routines.

The (4) transferring information routine takes (1) reported problems and refers them to either a direct problem solving routine, or more specifically to its (5) code writing stage, or to the (7) discourse-driven problem solving routine. The problems handled by the transferring information routine have often had their resolution delayed, and the forwarding happens at a later point in time compared to the initial reporting of the problem. The direct problem solving routine starts out as other routines do, through (1) reporting problems, and then passes through stages of (5) writing code, which often is (6) evaluated as satisfactory, leading to merging of the code into the codebase. Code that is not deemed satisfactory enters (7) an inquiry process and becomes handled by a discourse-driven problem solving routine. After interdependencies have been inquired into, the routine iterates between (8) formulating demands and responses, after which (9) decisions are made with regards to iteration, rejection, or merging.


Figure 9. Aggregated Routine Structure


For example, in pull request #1510 (an instantiation of the transferring information routine) a developer submits a crash report. This crash report is then referenced (information about the crash is transferred to a discourse-driven problem solving routine) in pull request #2025, which suggests a code patch (“rbx report is overwritting..rubinius_last_error if the submission fails. Fixes #1510”) that may fix the issue reported in #1510. Brief probing is conducted to understand failures which may be related to, but not intrinsic to, the original crash report (“can you check failures of CI?”). Discourse-driven problem solving establishes that the observed failures are unrelated to the original crash report, and the multiple cycles of inquiry exhibit an increased degree of heterogeneity. Finally, a developer rewrites the code to provide a fix, and closes the pull request. In this example we can see how developers coordinate their work around a specific technical problem by first evaluating it, and then shifting complexity towards a discourse-driven problem solving routine within which the problem is inquired into and adequate variety is generated so as to match the complexity of the overall task. Hence, as the complexity of the task is uncovered, developers can leverage more heterogeneous routines so as to match the emerging task complexity.

Information processing responses to complexity

Together the four routine clusters form an interdependent information processing system, where the complexity of tasks is dealt with in four distinct ways. As such, all four routines participate in the process of identifying and creating solutions to tasks, but in different ways. This helps us reconcile the claim made by extant accounts – that OSS developers deal only with simple tasks – with the observed process variation. As tasks exhibiting different degrees of complexity enter the OSS development agenda, alternative configurations of routines within the system are generated to match the varying complexity of incoming tasks. Thus, routine variety is an outcome of the multiple ways in which developers learn to evaluate, shift, and match the complexity of incoming tasks – both simple and complex (see Table 19).

Table 19. Information processing responses to complexity

Triaging Tasks
  Heterogeneity: Lowest; Entropy: Lowest
  Information processing function: Evaluating Complexity
  Description: Evaluates and controls complexity through selecting certain tasks for further processing while rejecting others

Transferring Information
  Heterogeneity: Low; Entropy: Low
  Information processing function: Shifting Complexity
  Description: Shifts complexity to a discourse-driven problem solving or direct problem solving routine based on recorded knowledge of past performances

Discourse-driven problem solving
  Heterogeneity: Highest; Entropy: High
  Information processing function: Matching Heterogeneity to Complexity
  Description: Matches complexity through aligning heterogeneous routines to the ambiguity of information in given task complexity

Direct problem solving
  Heterogeneity: High; Entropy: Highest
  Information processing function: Matching Entropy to Complexity
  Description: Matches complexity through aligning entropic routines to the uncertainty of information in given task complexity

Triaging Tasks deals with complexity through evaluating it. Through evaluating incoming tasks with regards to their fit with current development goals as well as importance, parts of incoming complexity are surveyed and selected for further processing, while other parts of incoming complexity are rejected. Triaging seeks to select certain tasks for further processing, and therefore the total intake of complexity is also controlled, as some tasks simply are not taken on – they are rejected, deferred, or shown to already have been solved.

Transferring Information deals with complexity through shifting it from one routine to another where it can be addressed more effectively. Usually information is referred from a less varied routine (triaging tasks) to a more varied routine (e.g. discourse-driven problem solving) so that task complexity can be matched by routine variety.

Discourse-driven problem solving addresses task complexity through leveraging high degrees of routine heterogeneity. Through several iterative stages, developers seek to match (amplify) routine heterogeneity to the ambiguity of information involved in a specific task. Overall, in this routine multiple developers conduct collaborative inquiry so as to make sense of the task at hand (Crowston & Kammerer, 1998; Weick, 1995), entailing distributed information processing and generating extensive flows of information and knowledge. The complexity of a task is discovered as the task is constructed – essentially forming a specific task out of many possible tasks. During this process ambiguity is reduced through the application of heterogeneity. To the degree that developers are successful in doing this, a discourse emerges that constitutes the generated heterogeneity and eventually provides novel solutions to the problems faced. This routine class represented over half of the routines in the study period, which shows that the bulk of routines in any given OSS project consists of various attempts to generate variety to grapple with the ambiguity of task complexity.

Direct problem solving represents the means through which individual developers efficiently match (amplify) routine entropy to the uncertainty of simpler problems.

This is possible because other routines have sorted out more complex tasks and dealt with them elsewhere, leaving a smaller set of simpler tasks that can more easily be solved.

This enables developers to leverage entropic sequences that support the writing of code.

Such routines are focused on dealing with the uncertainty entailed by the need to process multiple bits of known information. Direct problem solving essentially mirrors the ‘superposition’ development routine, enabled by modularization and layering of code. In essence, highly entropic routines provide the most efficient means possible for solving relatively simple problems. However, this routine is only made possible through its inclusion in a larger system of routines, which constantly evaluates, shifts, and matches complexity through creating routines of varying entropy and heterogeneity.

Structure of successful development processes

Key to the model is that successful coordination and completion of development tasks (i.e. merging, as opposed to rejecting or abandoning suggested code contributions) is achieved in two different ways across direct and discourse-driven problem solving. In the former, entropy is the only real driver of successful task completion – indicating that processing uncertainty is sufficient with regards to solving relatively simple problems. In the latter, routine heterogeneity is also important. This indicates that such routines involve iterative cycles of information processing geared towards inquiring into ambiguous and sometimes conflicting bits of information – characteristics of more complex development problems.

This finding is further corroborated by the fact that both entropy and heterogeneity influence the duration of pull requests in such a manner that higher entropy tends to drive shorter durations, whereas heterogeneity tends to increase the duration of pull requests.

Hence, the entropy of routines tends to represent more efficient forms of information processing, whereas heterogeneous routines contain more iterative information processing cycles indicative of discourse-driven problem solving, and are therefore less efficient but richer in structure.


Discussion

The main insights of this study can be distilled into three main contributions: a) extending previous accounts of coordination in OSS through opening up the black box of coordination processes, b) explicating the role of routine heterogeneity and discourse in tackling complex tasks, and c) showing how information processing routines emerge in the context of OSS to coordinate the solving of complex tasks.

While it may seem as if developers simply deal with simple problems (Howison & Crowston, 2014), the ecology-of-routines perspective extends this story. I argue that developers build systems of routines, which generate requisite degrees of entropy and heterogeneity to match the varying uncertainty and ambiguity (i.e. task complexity) related to development problems. Here, leveraging heterogeneity to match the ambiguity of complex tasks is necessary, because such tasks are not pre-defined; rather, their structure is recursively constructed through discourse. This insight only becomes exposed after a routines perspective has been adopted, which is why it has been obscured in previous research. By showing this, I further open up the “black box” of OSS development (Howison & Crowston, 2014).

Opening this part of the black box gives us insight into the role which differing configurations of routine variety – entropy and heterogeneity – play in distributed, digital innovation activities (Yoo et al., 2010). Previous studies (Salvato, 2009) suggest that more homogeneous routines lead to higher performance in design projects – due to stabilized attention and mindfulness. My account provides partial support for this statement – highly entropic routines offer efficient means for direct problem solving.

However, my account also tells a different story – routine heterogeneity is necessary to match the inherent ambiguity of complex design tasks (Jason et al., 2013). Since such tasks often are complex and therefore need to be framed or constructed as they are being performed, routine heterogeneity becomes an essential characteristic of such problem solving tasks – as is illustrated by the discourse-driven problem solving routines within the Rubinius project. Hence, there are many conceivable conditions under which higher degrees of routine heterogeneity are preferable.

Lastly, task complexity is commonplace in software development (Xia & Lee, 2005). It is also widely known that coordination of such complexity is more difficult under conditions of dispersed teams and dynamic requirements, given the stickiness of software and domain knowledge (Lee et al., 2013). These are exactly the conditions under which OSS developers work. The coordination difficulties arise because of the multitude of possible solutions and approaches to solutions (and conflicts between these variants) that complex tasks imply (Campbell, 1988), and the constant challenge of clarifying domain assumptions. Hence, explaining how developers can respond to task complexity is key to understanding how OSS projects coordinate themselves. Though coping with complexity through coordinating collective action has been an important concern in IS development at large (Benbya & McKelvey, 2006), the study of OSS comes with the added caveat that we need to illustrate how structures, such as routines, emerge from the numerous interactions between developers (rather than being explicitly formulated to guide their interactions). As developers attempt to cope with tasks of varying complexity they organize their activities to evaluate, shift, and match the complexity of tasks. Through this process, entrained patterns emerge – routines that fulfill each of the information processing functions necessary to separate out simple and complex tasks, and generate the requisite variety to solve both. Hence, I contribute through extending the information processing view (IPV) to explain not only the conscious formation of structures (Galbraith, 1974), but also their emergence in self-organizing contexts. This opens up further exploration of information processing structures in contexts where direct control over such structures is untenable.

In conclusion, the law of requisite variety is simple, yet deceptive. Its implication for OSS is that developers must muster sufficient amounts of variety to address the environmental complexity they face, either by attenuating it or by amplifying their capability to generate variety. The challenge for researchers is to explain how requisite variety is actually mustered. In this study I have argued that specific routine configurations form important sources of variety. Many see OSS as a harbinger of future forms of organizing founded on loosely coordinated networks of autonomous individuals integrated by digital capabilities. Therefore the study of how routines coordinate complex tasks in OSS contexts can help us understand coordination in the novel forms of organizing that are arising in the digital age.


Study #2: The Evolution of Routines as Responses to Environmental Shifts

From its earliest days, software development has been known to follow a cyclical development pattern of inception, development, and decay. Similar patterns are repeated in release control of software. Immediately after each release of software, new requirements arise (often in the form of bugs in the code) and a process of identifying requirements, designing and coding the software, as well as testing ensues. A variety of methodologies have been proposed and enacted to improve this process, from the classic waterfall model (Royce, 1970) to various lifecycle models (Davis & Olson, 1984) including the spiral model (Boehm, 1988) or agile approaches (e.g. Highsmith & Cockburn, 2001). Each prescriptive and descriptive analysis of software development highlights the presence of a ‘lifecycle’ of development processes that is repeated in the never-ending series of software releases.

Open Source Software (OSS) development processes, however, are typically not conceived of in terms of lifecycles. Since OSS projects can involve thousands of independent, typically volunteering, developers, a rationalized, predictable lifecycle process seems out of the question. Once a piece of software is made available to the OSS community, OSS developers self-organize to address bugs and issues in an emergent manner as they arise, and through this mechanism they continuously and incrementally extend and strengthen the code (Raymond, 2001). OSS developers postpone major, complex problems and instead pursue incremental, manageable contributions. These incremental contributions layer upon each other in a sedimented fashion and enable the development of complex software (Howison & Crowston, 2014). Although existing research provides insight into various processes that OSS communities follow, we have little understanding of how these processes vary across the release cycle and whether they exhibit any stages. While OSS projects have releases, we do not know if there is a lifecycle pattern between these releases.

To investigate the presence of temporal patterns of activity in OSS development, I conducted a longitudinal, exploratory, mixed methods computational analysis of a major OSS project, Ruby on Rails. Overall, I examined the project’s 126,182 activities over 28 months (May 2011 to August 2013) using computational sequence analysis techniques. These activities comprised 11,846 activity sequences, which were clustered into two classes of ‘routines’ (repeated sequences of activities; Pentland, 2005) – discourse-driven and direct problem solving routines. I found that the relative distribution of these routines is associated with the intensity of discursive shaping of software features occurring throughout the release cycle. The ebb and flow of discourse throughout the project forms the distinct stages between each release. I identify the following stages: a) cleanup: exhibiting a balance between discourse-driven and direct problem solving routines, b) sedimentation: dominated by direct problem solving routines, and c) negotiation: dominated by discourse-driven problem solving routines. This suggests the presence of a lifecycle within OSS release processes.

The rest of the paper is organized as follows. First I briefly review the literature on practices and routines in the OSS context and link this to the notion of routine heterogeneity, followed by my case study. The case study explains my qualitative and computational findings with regards to routine heterogeneity across the release cycle. I then theorize the role of, and generative mechanisms behind, routine heterogeneity. Last, I validate the findings using Social Network Analysis (SNA) and discuss the theoretical and practical implications of my inquiry.

Lifecycles in OSS Development

The practices of OSS communities are commonly based on the original ‘bazaar’ metaphor (Raymond, 2001) – where a mass of developers independently generate solutions to small problems that emerge as developers “scratch their own itches.” Related and preceding practices, such as the shaping of features (traditionally labeled “requirements engineering”), are usually understood to be conducted within a discourse among developers (Hirschheim & Klein, 1994; Winograd & Flores, 1986; Winograd, 1987) using various ‘informalisms’ (Scacchi, 2009) – blog posts, forum discussions, ideas stored on bug trackers, etc. Such activities are coordinated through a process of “open superpositioning” in which developers learn to solve small, independent problems in a piecemeal fashion, layering solutions on top of preceding incremental solutions and eventually producing complex, large-scale code (Howison & Crowston, 2014). Overall, the picture of OSS development that emerges from the literature is one of a continuous, seamless accrual of individual features emerging from free-flowing discourse that generates solutions to encapsulated, incremental problems.

OSS development is thereby conceptualized as a ‘mode’ or ‘approach’ of development rather than a structured process enacting a methodology that transforms itself across time, stages, or cycles. Further, software development practices are rarely conceptualized in terms of iterative cycles of discovering and implementing requirements through a release. I refer to repeated temporal patterns of activity within a software release generically as “life cycle models”. Most proposed models of software development are founded on the idea of a life cycle (Boehm, 1988; Highsmith & Cockburn, 2001; Royce, 1970), but there is no model assuming lifecycles between releases in OSS.

There are indications that OSS projects (rather than processes) evolve through stages – often couched in terms of growth. Studies have explored the temporal organization of software development practices in OSS projects (Lee & Cole, 2003) and it has been shown, for example, that changes in the size of the OSS codebase can follow a variety of patterns, such as linear (Robles et al., 2005), superlinear (Godfrey, 2000), and punctuated equilibria (Capiluppi, Faria, & Ramil, 2005). Similarly, successful OSS communities grow over time and the community growth occurs in recognizable stages (Capiluppi et al., 2007). Understanding such growth stages is important, but it only provides insights into a particular aspect of an OSS project’s change over time – the magnitude of growth. From the extant literature on OSS we should also expect to see qualitative changes across communities and over time in terms of how their practices are structured. For example, we may want to understand how everyday practices, the lifeblood of OSS (Monteiro & Østerlie, 2004), such as requirements engineering, change over time. Therefore, we need to look not only at changes in magnitudes, such as growth of the codebase, but also at changes in the structuring of OSS practices.

Structural variations across OSS communities (Crowston & Howison, 2006) suggest that practices should consistently differ across projects and time points. Further, based on what little we know of lifecycles in online peer production in general (Kane et al., 2014) and OSS specifically (Capiluppi et al., 2007), we would expect that development practices are likely to differ across different periods of a release cycle. However, no temporal patterns of OSS activities within a release cycle have been examined or identified in the literature.

Patterned sequences of OSS activities can be viewed as organizational routines (Pentland, 2005) – persistent, reproduced enactments of activity distributed across a community. We know from past research that these patterns are sources of both change and stability (Feldman & Pentland, 2003). It is therefore to be expected that such patterns shift their characteristics over time as a response to various exogenous and endogenous factors. One way of characterizing the temporal structure of such patterns is to capture the level of their heterogeneity over time (Salvato, 2009). Heterogeneity refers here to the overall structural variation of routines – i.e. variation in the activity types and their sequential ordering. Next, I will build upon this heterogeneity construct.

Explaining Heterogeneity in OSS Development Routines Across Time

The notion of heterogeneity is used to approximate the dynamism and diversity of a routine. It is important to the study of routines because dynamic, diverse, and complex routines are necessary to respond to ambiguous situations (Hærem, Pentland, & Miller, 2014). Routine diversity thus increases the robustness and resilience of the work system (Page, 2010). Overall, heterogeneity in routines represents a higher degree of information processing capacity, which enables communities and organizations to adaptively deal with more difficult problems.

Some previous studies imply that routine heterogeneity is low in OSS contexts. Howison & Crowston (2014), for example, posit that OSS artifacts are constructed by solving small, independent problems using simple local processes, implying that OSS routines do not exhibit a large degree of heterogeneity. However, the extant literature also suggests that varying degrees of routine heterogeneity are present (e.g. Godfrey & Qiang, 2000). Even in such cases, it is unclear what constitutes and drives routine heterogeneity.

The extant literature has attempted to explain the drivers of routine heterogeneity in multiple ways – none of which are sufficient to explain temporal or spatial variation in the heterogeneity of development routines across and within OSS projects. These attempts center primarily on the value of strict ostensive routines (Schroeder, Linderman, Liedtke, & Choo, 2008), limited cognitive focus (Salvato, 2009), effects of digitalization (Gaskin et al., 2011), and the presence of specific evolutionary mechanisms (Lee & Cole, 2003). While each of these provides plausible explanations for why routine heterogeneity may vary in specific contexts, they are generally not suitable for analyzing routine heterogeneity in the OSS context, as I will show below. Most of the caveats that make their application problematic relate to the fully digital and volunteer-based nature of OSS development.

Some factors have been theorized to decrease heterogeneity, such as strong ostensive routines. For example, Six Sigma or Capability Maturity Model (CMM) approaches rely on strong routines that aim to decrease routine heterogeneity and thereby increase the quality of work (Schroeder et al., 2008). Minimization of routine heterogeneity is achieved through strict process models where types of activities are sequenced in a particular way. Accordingly, workflows become standardized and the potential for overall routine heterogeneity is reduced. However, such strong ostensive routines are often not possible to specify in OSS contexts (Scacchi, 2001).


Another factor associated with decreased routine heterogeneity is the role routines play in stabilizing cognitive focus (Salvato, 2009). As organizations limit the span of attention and related cognitive effort, organizational activity becomes homogenized and exploration is reduced. Stabilization and homogeneity of routines also lead to a greater economy of activities. However, such subjectively constructed and collective states of mind are difficult to engender and control in virtual contexts such as OSS. Further, the minimization of routine heterogeneity may not even be desirable for the viability of the OSS community and attendant software artifacts.

One explanation of increases in routine heterogeneity involves the introduction of digital tools in routines. Digitalization may under specific circumstances increase heterogeneity in various design processes (Gaskin et al., 2011). The proposed mechanism operates through the increased opportunities for varied combinations of design actions that digital tools afford if the design environment is turbulent and there is low decision centrality. However, most OSS work is carried out with digital tools. Therefore this cannot explain temporal or spatial variation in routine heterogeneity in OSS projects.

Last, some analyses of evolutionary mechanisms could explain the differences in variation. Such mechanisms have been argued to lead to increased heterogeneity – meaning that higher degrees of heterogeneity (at least in natural systems, but conceivably also in socio-technical systems) lead to higher degrees of robustness, and therefore survivability and performance (Page, 2010). This line of reasoning would lead to the conclusion that heterogeneity is generally driven by the presence of evolutionary mechanisms (Lee & Cole, 2003). However, simply positing that evolutionary mechanisms are at play is not sufficient to explain how heterogeneity is created and why it varies under different local or temporal conditions. To do this, we need to inquire into the specific mechanisms that drive variation, selection, and retention and their linkages to varying degrees of routine heterogeneity.

All in all, previous explanations of heterogeneity are problematic when applied to the OSS context. They are problematic because they either rely on mechanisms that are largely unavailable to OSS developers (i.e. specifying strong ostensive routines), mechanisms which are difficult to maintain in OSS contexts (i.e. stabilized levels of cognitive focus), or aspects which are common to all OSS projects (i.e. degree of digitalization and evolutionary mechanisms). Therefore we need to continue the search for the mechanisms that generate temporal routine heterogeneity in OSS projects.

A Longitudinal Study of OSS Routines: Shifts in Routine Compositions

To understand the role of routine heterogeneity across the release cycle, this study combines qualitative inquiry, using mainly archival text data to analyze the mechanisms and conditions that generate routine heterogeneity, with computational analysis, using digital trace data to analyze the structural variation of routines and networks of developers. The overall process is described in Figure 10 below:

Figure 10. Computational-Qualitative Research Method


Case Selection

Ruby on Rails (or simply ‘Rails’) is possibly the most popular web development framework in the market. It is an OSS project started in 2004 by David Heinemeier Hansson (often referred to as ‘DHH’) for use by his company, 37signals, to build a specific web application: Basecamp – a project management tool. The Rails framework is built upon the Ruby programming language.

I selected Rails as a case study for four reasons: 1) it is a typical example (Eisenhardt, 1989) of a relatively successful OSS project; 2) it has had multiple official releases and has gained traction within the larger web development community; 3) the project is medium-sized, but socially complex, since it is essentially an evolving collation of different technologies; 4) the project is well documented in online archives, which means that I was able to access hundreds of hours of conference presentations, mailing list archives, blog posts, and other forms of publicly available sources of data – presented in the same format as participants access them – in an online format.

I therefore argue that understanding the dynamics of this case helps us gain further insights (Yin, 2008) into the emergent routines of medium-sized, complex OSS projects, which often lack the elaborate and explicit governance structures shared by the largest projects of the OSS world (e.g. Linux, Apache, Mozilla).

Computational Analysis

Here I describe the collection and analysis of computational data. My efforts focused on determining the characteristics of routine structures as well as their distributions across time.


Extracting Digital Traces

Routines are rendered measurable through Github, which records time-stamped sequences of all activities, associated actors, and activity content, all organized by pull requests (suggested code changes). I operationalize a pull request as a sequence of activities connected by a common Pull Request Identification number (PID). Pull requests are used by developers who do not have access rights to edit the repository directly, or when developers want feedback on their suggested code changes. A pull request can contain multiple commits (i.e., detailed changes to specific pieces of code). This is code that a developer has merged into his/her own, local copy of the code, and now wants the project owner to ‘pull’ so that it can be merged into the common baseline copy of the code. The request is then discussed and renegotiated before a decision is made whether to merge the code into the baseline code, to reject the request, or to postpone work on the request. The types of activities that are logged are specified by the designers of the Github platform itself, and are detailed in Table 22.

In order to extract the digital traces of activities (Anjewierden & Efimova, 2006), I developed several scripts using the data mining toolkit by Gousios & Spinellis (2012), each of which is captured in Appendices A-C. These scripts capture every activity related to each pull request during the given time period. I extracted 11,846 routine performances consisting of 126,182 activities. Because analysis of such a large dataset exceeded my computing resources, I randomly sampled 10% (1,184 routine performances, 12,765 activities) of the activity sequences and conducted all subsequent analyses on this sample. All of these activity sequences can be analyzed computationally as well as qualitatively, as texts of bug reports, discussions around how to fix bugs, and records of how the eventual code fixes were done.
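To make the extraction step concrete, the sketch below shows how pull request activities can be grouped into PID-keyed sequences and down-sampled. It is illustrative only: the file and column names (events.csv, pull_request_id, created_at, event_type) are hypothetical stand-ins for the GHTorrent tables queried by the actual scripts in Appendices A-C.

```python
# Minimal sketch of the trace-extraction step; schema names are hypothetical.
import pandas as pd

events = pd.read_csv("events.csv", parse_dates=["created_at"])

# A routine performance = all activities sharing one pull request ID (PID),
# ordered by timestamp.
sequences = (
    events.sort_values("created_at")
          .groupby("pull_request_id")["event_type"]
          .apply(list)
)

# Randomly sample 10% of the performances for computational tractability.
sample = sequences.sample(frac=0.10, random_state=42)
print(f"{len(sample)} routine performances, "
      f"{sum(len(s) for s in sample)} activities")
```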

Establish Longitudinal Patterns
Computational analysis of activity sequences – called sequence analysis – permits us to analyze the heterogeneity of activity patterns (Gabadinho, Ritschard, & Studer, 2011). These methods emerged in sociology, mainly to investigate life courses and careers (Abbott & Hrycak, 1990) or the spatial movement of actors (Wilson, 2001). Results of sequence analysis can be further organized into compositionally similar clusters of activities (e.g. Sabherwal & Robey, 1993). The clustering of sequences and the measurement of heterogeneity are derived from the calculated distances between sequences. In sequence analysis, this distance represents the degree of dissimilarity between two sequences as calculated using Optimal Matching (OM) methods. Consider the following two pull request sequences (1 and 2) of Github digital trace data:

1: open / comment / merge

2: open / comment / close

In order to measure the similarity in these activity patterns – the extent to which sequences align, or are similar – we need to estimate the effort required to transform one of the sequences into the other. In this example, we have to replace the 'merge' activity in sequence 1 with a 'close' activity to arrive at sequence 2. Assuming that the cost for a single conversion is set to 2, the total cost, or distance, between the sequences is 2 (Abbott & Hrycak, 1990). Converting sequences with more differences results in higher distances, which means that the sequences are more dissimilar, while lower distances indicate more similarity. The total distance between two sequences is called the OM distance. The OM distances between every pair of sequences in a given set constitute a distance matrix. By exploring the distance matrix we can observe that a set of sequences with only small distances implies low heterogeneity. We are more likely to observe this when sequences are short or when activity sequences follow common patterns. In another set of sequences we may observe larger distances, implying greater heterogeneity. We are more likely to observe this when sequences are long and exhibit unique patterns in the ordering of activities.
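The following sketch illustrates how such an OM distance can be computed with dynamic programming, assuming a constant substitution cost of 2 and an insertion/deletion (indel) cost of 1 – conventional defaults rather than a record of my exact cost settings:

```python
# Optimal Matching distance as a dynamic-programming edit distance,
# assuming substitution cost 2 and indel cost 1.
def om_distance(a, b, sub_cost=2.0, indel_cost=1.0):
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel_cost
    for j in range(1, n + 1):
        d[0][j] = j * indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = d[i - 1][j - 1] + (0.0 if a[i - 1] == b[j - 1] else sub_cost)
            d[i][j] = min(sub,
                          d[i - 1][j] + indel_cost,   # deletion
                          d[i][j - 1] + indel_cost)   # insertion
    return d[m][n]

seq1 = ["open", "comment", "merge"]
seq2 = ["open", "comment", "close"]
print(om_distance(seq1, seq2))  # 2.0: one substitution (merge -> close)
```

Computing this distance for every pair of sequences in the sample yields the distance matrix described above.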

Computational Elicitation of Routines
Clustering algorithms use the distance matrix as an input to derive clusters of sequences (in my case, pull requests) based on their OM distances. The emergent clusters can then be considered routines – families of similarly patterned activity sequences. The clustering algorithm does not by itself specify the number of clusters, but provides goodness-of-fit statistics for evaluating multiple clustering solutions. In general, these fit statistics evaluate how well each clustering solution reproduces the original distance matrix, as well as how convergent and discriminant the clusters are (Studer, 2013). As suggested by Studer (2013), I used several standard statistics to determine a suitable number of clusters: Point Biserial Correlation (PBC), Hubert's Gamma (HG), and Average Silhouette Width (ASW). PBC and HG measure the capacity of the clustering to reproduce the distance matrix, whereas ASW measures the degree to which distances between routines are small within clusters and large between clusters.
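As an illustration of this step, the sketch below clusters a precomputed OM distance matrix hierarchically and reports ASW for candidate solutions. PBC and HG are omitted for brevity, and the average-linkage choice is an assumption rather than a record of the exact algorithm used:

```python
# Agglomerative clustering on a square OM distance matrix D, with ASW
# compared across candidate numbers of clusters.
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import silhouette_score

def cluster_and_evaluate(D, max_k=6):
    condensed = squareform(D)                    # condensed form for scipy
    tree = linkage(condensed, method="average")  # agglomerative clustering
    for k in range(2, max_k + 1):
        labels = fcluster(tree, t=k, criterion="maxclust")
        asw = silhouette_score(D, labels, metric="precomputed")
        print(f"k={k}: ASW={asw:.3f}")  # a distinct drop after k=2 would
                                        # favor a two-cluster solution
    return tree
```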

Last, using sequence analysis I can extract "representative sequences" for each of the clusters. These are actual sequences within the dataset that lie close to the theoretical center of each cluster – meaning that they minimize the OM distances to all other sequences in the cluster. Therefore, these sequences can be taken to be representative – a sort of mean or median – of the sequences within each cluster. These computationally derived representative sequences are then used for qualitative coding to derive the characteristics of each routine cluster.
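Conceptually, such a representative sequence is the medoid of its cluster, as in this sketch (the distance matrix D and cluster labels are assumed to come from the steps above):

```python
# Medoid extraction: for each cluster, the observed sequence minimizing the
# summed OM distances to all other sequences in that cluster.
import numpy as np

def representative_indices(D, labels):
    reps = {}
    for k in np.unique(labels):
        members = np.where(labels == k)[0]
        within = D[np.ix_(members, members)]       # intra-cluster distances
        reps[k] = members[within.sum(axis=1).argmin()]
    return reps
```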

Descriptive Statistics and Metrics
After extracting clusters, I moved on to extract descriptive statistics for each of the clusters. These statistics were extracted using a combination of the already analyzed sequence data, query scripts, and inferences drawn from the coding of the qualitative data. To measure the extent and number of activities related to each pull request, I calculated their average duration, average number of participants, and average number of related activities. To capture the successful coordination of problem solving activities, I calculated the percentage of pull requests which were eventually merged into the codebase. To assess the complexity of code, I calculated the average Lines of Code (LOC) added and deleted by each pull request, the average number of commits (distinct code patches) attached to each pull request, and the average number of files modified by each pull request. The degree of heterogeneity of each cluster is calculated as the average of all the OM distances within that cluster, normalized by the length of the sequences. This indicates the overall heterogeneity in the sequential ordering of activities and their types.
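One plausible reading of this heterogeneity metric is sketched below; normalizing each pairwise distance by the longer of the two sequences is an assumption, since several normalizations are possible:

```python
# Cluster heterogeneity: mean length-normalized pairwise OM distance among
# the sequences of one cluster.
import numpy as np
from itertools import combinations

def cluster_heterogeneity(seqs, om_distance):
    if len(seqs) < 2:
        return 0.0
    norm = [om_distance(a, b) / max(len(a), len(b))
            for a, b in combinations(seqs, 2)]
    return float(np.mean(norm))
```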

Last, I plot the occurrence of routine performances within each cluster, so as to ascertain the distribution of routines across time. This is done by using the inception date of a routine (when the issue or pull request was opened) as a time marker.


Qualitative Inquiry
Here I describe the collection and analysis of qualitative data. My efforts focused on publicly available archival data, supplemented with a smaller number of interviews.

Interviewing and Collecting Archival Data
The dataset mainly consists of archival data. There is a wealth of public data sources available on the Rails project, including hundreds of hours of video and audio, as well as thousands of blog posts and other online articles. To sort through this data I made a listing of a) the most important conferences, b) podcasts, and c) blog and news sites. I surveyed this data to identify video-recorded presentations, specific podcast interviews, as well as blog posts and other articles which concerned themselves specifically with the development of new features in Rails during the studied time period. The data sources selected in this manner were then subjected to open and axial coding to identify emergent themes with regard to new feature development.

Specifically, these data sources include audio (podcasts and interviews, ranging in duration from 1 to 2 hours each), video (conference keynotes, panels, talks, and interviews, ranging in duration from 30 minutes to 3 hours each), and text (blog posts, interviews, and documentation). I also collected all emails from the Rails mailing list.

Interviews were transcribed, public audio was already transcribed by external parties, and specific sections of public video were transcribed based on their relevance to my inquiry.

For interviews, I sampled 13 informants from across the Rails developer community, including core team members, regular contributors, and peripheral contributors.

Interviews were dynamically structured to tap into each interviewee's actual experience of developing code for Rails. The duration of each interview ranged from 45 minutes to 1.5 hours, and common questions were of the type: "Please describe a recent contribution you made. What did you do? What were you thinking?" These interviews served mostly as a way of helping to sort through the primary source of insight – the archival data. See Table 20 for a summary.

Table 20. Data Collection

Measure         N          Comment
Interviews      13         Rails core developers, community members, and peripheral developers
Public Audio    11         Podcasts and other audio interviews conducted with core developers
Public Video    18         Videotaped conference talks conducted by core developers
Public Text     20         Blog posts and interviews conducted with core developers
Public Email    225        Mailing list conversations between core developers, developers, and users
Routines        1,184      Distinct pull requests on Github
(Activities)    (12,765)

Map out Release Cycles
Using archival data, both textual and descriptive statistics available from the Github version control platform, I mapped out the timing and characteristics of the various releases. This includes descriptions of the new features and fixes implemented in each of the releases, as well as the magnitude of changes in terms of files, commits, lines of code added and deleted, and the number of contributors. This mapping of releases and their properties serves as a basic temporal landscape which will help us interpret the changes in routine structures that occur over the release cycle.

Code Representative Sequences
Using sequence analysis, I statistically identified representative sequences from each routine cluster. For each cluster, 10% of the representative sequences were coded using a grounded approach (79 sequences for discourse-driven problem solving and 40 sequences for direct problem solving) to identify emergent themes and categories. This helped to establish the qualitative characteristics of the various routines – what developers are actually doing and accomplishing within each of the routines. I present this as the "repertoires of activities" which each routine provides to developers.

Code Contextual Data
Contextual data was coded to get a sense of the mechanisms which shape routine structures, i.e. the requirements determination discourse which unfolds at an increasing rate across multiple arenas. Through open and axial coding, emergent categories were identified which captured the ways in which developers inquire into, understand, and debate various features being considered for implementation. The coding yielded categories such as "discursive shaping", "attention to concerns and arguments", and "controversy".

Zooming in/out & Theorizing the Mechanism
I sought to identify broad patterns of routines using the computational analysis and then looked to elaborate the local origins and mechanisms that underlie these patterns, which enabled me to pay in-depth attention to the specific contexts under which those patterns arise. Gaskin et al. (2014) refer to this as "zooming in and out" of routines and their contexts – a practice that helps to achieve conceptual and theoretical consistency. For example, the structure of "discourse-driven problem solving" was compared with the contextual nature and content of the discourse unfolding across blogs and mailing list conversations. This helped establish a contextual understanding of the origins of specific patterns of routines. Through establishing consistency across data on routines, contextual data, and the results of qualitative inquiry, I craft a theoretical account that explains why routine structures in the studied OSS project changed over time the way that they did.


Validation using SNA
Last, I used social network analysis (SNA) to show that increased interaction across the community is associated with higher degrees of routine heterogeneity and signifies a greater prevalence of the "discourse-driven problem solving" routine. Every developer who performs an action in relation to a pull request becomes incorporated as a 'node', and 'edges', or links between nodes, are established between two developers when one developer performs an action directly following the action of another developer on the same pull request. I used the following network statistics: number of edges and nodes in the network, Least-Upper-Boundedness, coreness, centralization, hierarchy, and connectedness (Krackhardt, 1994). The first two are simply measures of the number of actors and their relationships, while the others are more complex measures of network properties. Below I describe each of these measures; a code sketch of the network construction and two of the measures follows the list.

1) Least Upper Boundedness – In the ideal hierarchy, all pairs of developers have a common "core member" closer to the center of the community. The metric is 1 where all pairs of developers have a common core member working with them, and 0 where no two individuals work with the same core member.

2) Coreness – describes the degree to which the network forms large components. Technically, the coreness of a graph is the maximal number k such that there is a subgraph in which each node has a degree of at least k. Substantively, this measures the degree to which the graph has a large core at its center.

3) Centralization – is calculated by first calculating the centrality of each developer in the network, and then looking at the inequality of centrality across the network. If one or a few developers have very high centrality whereas others have low centrality, we can say that the entire network is highly centralized, and vice versa. Hence, this measure gives an indication of how unevenly interaction is distributed across the developers within the community. For example, if a small group of developers are highly involved with each other, while the rest of the network has few interactions with that group as well as with each other, then network centralization can be expected to be high, due to the unevenness of centrality across the network (Wasserman & Faust, 1994).

4) Hierarchy – measures the degree to which a network is a perfect hierarchy (asymmetric), meaning that all developers are connected to the core through intermediate layers, and that no developer 'skips' these intermediate layers; e.g., direct connections from the outer periphery to the core are absent. This means that a core member can reach out towards anyone in the periphery through established network ties, but the peripheral members do not necessarily have direct ties to core members. The metric is 0 if all relationships are symmetric, and 1 for a fully asymmetric network.

5) Connectedness – measures the degree to which everyone is connected. If everyone is connected the score is 1; a set of nodes without edges has a connectedness score of 0. This metric can help us understand whether we are dealing with a single network or a group of non-connected graphs, and can therefore help us identify subgroups within the community that work in isolation on various parts of the code.
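The sketch below illustrates the network construction and two of the Krackhardt measures (connectedness and hierarchy). Variable names are illustrative, and the implementations follow the verbal definitions above rather than any particular SNA package:

```python
# Build the developer collaboration network from ordered pull request
# activities and compute two Krackhardt (1994) statistics.
import networkx as nx
from itertools import combinations

def build_collaboration_graph(actor_sequences):
    """Edge u -> v when v's action directly follows u's on the same PID."""
    G = nx.DiGraph()
    for actors in actor_sequences:  # one ordered actor list per pull request
        G.add_nodes_from(actors)
        for u, v in zip(actors, actors[1:]):
            if u != v:
                G.add_edge(u, v)
    return G

def connectedness(G):
    """Share of node pairs joined by some path in the underlying
    undirected graph (1 = everyone connected, 0 = no edges)."""
    n = G.number_of_nodes()
    pairs = n * (n - 1) / 2
    reach = sum(len(c) * (len(c) - 1) / 2
                for c in nx.connected_components(G.to_undirected()))
    return reach / pairs

def hierarchy(G):
    """Share of reachable pairs that are reachable in one direction only
    (0 = fully symmetric, 1 = fully asymmetric)."""
    desc = {v: nx.descendants(G, v) for v in G}
    sym = asym = 0
    for u, v in combinations(G.nodes, 2):
        uv, vu = v in desc[u], u in desc[v]
        if uv and vu:
            sym += 1
        elif uv or vu:
            asym += 1
    return asym / (sym + asym) if (sym + asym) else 0.0
```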

Findings
I report the findings as follows: first, I detail the overall pattern of how routine heterogeneity changes over time. Then I report findings from the cluster analysis and provide qualitative and computational details on the routine types represented by each cluster. The distribution of routine performances across these two routine types across time is plotted next. To explain why the routine structures have changed the way they have, I detail the contours of the feature development discourse which engenders routine heterogeneity. Further, to validate that this mechanism intensifies concurrently with feature-rich and controversial releases, I extract social network statistics to show how a higher degree of interaction across the community takes place during more heterogeneous periods.

Patterns of Routine Heterogeneity
As we can see in Figure 11 below, the overall routine heterogeneity increases, on average, across the period that I have studied. Note the local peak around the 3.1 release, the notable decrease in heterogeneity around the 3.2 release, and the climb towards the 4.0 beta in February 2013.

Figure 11. Routine Heterogeneity across time

Cluster Analysis
An evaluation of the possible cluster solutions is shown in Figure 12. Explanatory power is evaluated by the fit statistics PBC, HG, and ASW. We can see a distinct drop in explanatory power after two clusters. Consequently, I chose a two-cluster solution. Note that this is different from the four-cluster solution arrived at in Study #1. However, the two clusters that emerged from the clustering here do capture the main forms of problem solving outlined in Study #1: discourse-driven and direct problem solving. Hence, continuity with regard to central pieces of empirical evidence is maintained across Studies #1 and #2.

Figure 12. Cluster Fit Statistics

In Table 21 we can see descriptive statistics for each of the clusters. In terms of code, discourse-driven problem solving deals with code that is larger and more substantially varied. However, less than 20% of that code is merged, while direct problem solving merges almost 80% of the code attached to each pull request. Further, discourse-driven problem solving is, at 0.10, the more heterogeneous routine (compared to 0.05 for direct problem solving). Similarly, discourse-driven problem solving has more participants (1.25 vs. 1.14 on average), longer duration (24.83 days vs. 6.40 days on average), and a larger number of activities within each pull request (12.37 vs. 7.61 on average). Last, discourse-driven problem solving represents 66.63% of all routine performances, while direct problem solving represents 33.36%.


Table 21. Summary of Cluster Characteristics

                                            Discourse-driven      Direct
                                            problem solving       problem solving
Code Metrics
Average LOC added (stdev)                   136.41 (684.62)       99.48 (947.08)
Average LOC deleted (stdev)                 151.85 (1147.60)      37.93 (441.26)
Average commits attached to
  pull requests (stdev)                     18.43 (140.80)        6.66 (65.33)
Average files touched by
  pull requests (stdev)                     13.05 (72.66)         5.66 (48.18)
% Merged                                    18.50%                79.24%
Routine Characteristics
Heterogeneity (stdev)                       0.10 (0.12)           0.05 (0.03)
Average # of participants (stdev)           1.25 (0.54)           1.14 (0.43)
Average duration in days (stdev)            24.83 (82.10)         6.40 (30.96)
Average number of activities (stdev)        12.37 (14.06)         7.61 (4.07)
N (% of total)                              789 (66.63%)          395 (33.36%)

The clusters are plotted in Figure 13. Each identified activity sequence is plotted as a single point in a Euclidean space, where the distance between two dots represents the OM distance between two activity sequences. It is important to note that, because it is a Euclidean space, the X- and Y-axes do not have direct interpretations. Rather, the graph should be seen as a type of coordinate system which allows the reader to visually estimate distances between observations (i.e. distinct routine enactments). Hence, the overall spread of dots within a single cluster represents the overall detected heterogeneity within that cluster. For example, we can see that the direct problem solving cluster is largely concentrated in an area close to the lower right-hand corner of the Euclidean space, whereas the discourse-driven problem solving cluster is distributed across a large portion of the left side of the space. This indicates the greater degree of heterogeneity within the discourse-driven problem solving cluster.


Figure 13. Visualization of Routine Clusters

Zooming in on Routine Structures
In Table 22 I depict the frequencies of the various activity types within each of the two clusters. The first cluster is dominated by the 'discussed' activity (52.12%), which means that it contains large amounts of discussion in the form of comments on pull requests and attached code. This indicates that these routine performances contain strong elements of discourse – speech acts utilized to shape artifacts.

The second cluster has a less extreme activity structure. However, the most telling fact is that the 'merged' activity category accounts for a large portion of the activities (19.02%). This indicates that the routine performances within this cluster lead, to a high degree, to code being merged into the baseline codebase. Additionally, the amount of discussion that occurs is low at 12.34%, thereby indicating that the code being proposed is largely uncontroversial. To better understand what each of the routines is constituted by, I computationally extracted statistically representative sequences for each of the clusters, which I detail below.


Table 22. Activity Frequencies

Activity      Definition                                Discourse-driven         Direct
                                                        problem solving          problem solving
                                                        Freq.      %             Freq.     %
assigned      An issue is assigned to a
              specific developer                        72         0.74          5         0.17
closed        An issue is closed                        1,062      10.88         723       24.04
discussed     A discussion comment is made              5,086      52.12         371       12.34
mentioned     A specific commit is mentioned
              in a discussion comment                   1,954      20.02         169       5.62
merged        A pull request is merged into
              the baseline copy of the code             235        2.41          572       19.02
opened        An issue is opened/initiated              391        4.01          363       12.07
referenced    A pull request is referenced in
              another pull request                      603        6.18          626       20.82
reopened      A closed issue is reopened                61         0.63          7         0.23
reviewed      A specific snippet of code is
              commented upon                            294        3.01          171       5.69
Total                                                   9,758      100.00        3,007     100.00

The routine performances within the cluster I have labeled discourse-driven problem solving are distinguished by long (on average 24.83 days in duration, and 12.37 distinct activities) and heterogeneous (0.10) deliberations. These routine performances often concern code suggestions which, on average, are more extensive than those in direct problem solving across all the measures I extracted (LOC added & deleted, commits, and files touched). Through discursive interplay, various features are negotiated and a commonly agreed upon material form is arrived at. A representative example of a routine performance from this cluster is displayed below:

the500: This problem was found in Rails 2.3.5, ... The Rails default Logger impl is not thread-safe... Here is a test to prove it... (1 comment)
john: nasty one. Do you have a thread-safe silence implemented? ... (3 comments)
rishab: What's happening with this issue. Can we close this?
the500 closed this pull request
the_mittani: ...this still seems to be an issue...
miguel reopened this pull request
johanberg: We're deprecating silence because of this, see #7643.
johanberg closed this pull request
zeynab referenced this pull request

In this routine performance a developer identifies a bug with regard to thread-safety – the ability of code to utilize concurrency and multiple processors. After a while the issue is solved in a different routine performance, and this particular code suggestion (pull request) is closed.

Table 23. Repertoire of Activities – Discourse-driven problem solving

Reporting – Initial reporting of a bug or feature request
  "The similarity of Relation#uniq to Array#uniq is confusing. Since our Relation API is close to SQL terms I renamed #uniq to #distinct."

Inquiring into Code

Probing internal artifacts – Examining the consequences of the suggested changes for other parts of the Rails codebase
  "@DBRB Removing :uniq in favor of :distinct causes a build problem for activerecord-deprecated_finders (see rails/activerecord-deprecated_finders#8). Do you know if that should be fixed in AR-deprecated_finders or in Rails itself?"

Probing external artifacts – Examining the consequences of the suggested changes for packages external to Rails
  "I'm the maintainer of the jquery-ui-rails gem, and I just spent 30 minutes tracking this down because somebody had reported an issue on jquery-ui-rails -- only to discover that it's a weird problem with how Rails requires you to explicitly set the environment. Because of that, I'd like to second the request that Rails at least handle this more gracefully. It seems it would cause less work for all of us."

Probing use cases – Examining the consequences of the suggested changes for common or desired use cases within the Rails community
  "Overall it looks good, just afraid that people could be using uniq_value directly, and that could break?"


Aligning Code

Resolving conflicts – Resolving conflicts between suggested changes and internal/external artifacts, and/or use cases
  "I can update the PR to address this, but we will have to first agree on which of the followings is the "right" fix: 1. "pluralize is supposed to take phrases": Fix the inflector rules to drop all usage of /^.../i and replace them with a lookahead of something (because we already know \b doesn't work). 2. "pluralize is supposed to take phrases, but inflector rules are supposed to operate on a single word": Then change apply_inflection to match the rules on the last word in the input. 3. "pluralize is supposed to take a single word only": Then change tableize, update the docs and drop the unnecessary tests. 4. "Admit that the inflector is broken/inconsistent, but is too big to fail at this point": Do nothing."

Refining features – Improving the suggested code through iterative refinement
  "jesusdemaria: .response? Shouldn't this be .request?
  johanberg: Ugh, I think so, yes. Brains are a funny thing. I'll fix that now.
  davidnorberg: Argh! Sorry! Obviously hadn't had enough coffee this morning. Just noticed another one..."

Improving documentation – Making changes to documentation, tests, or version control messages
  "I updated the PR, assuming option 2 is the way to go. I didn't squash them because I think they deserve to each live inside their own commit to make things more traceable."

Making Decisions about Code

Merging – Merging attached code into the codebase
  "jesusdemaria merged a commit
  jesusdemaria: Thanks
  rhavas: Thanks."

Rejecting – Rejecting code as not suitable to be merged into the codebase
  "Encode will just serialize ryan object into some appropriate representation and passe it to the post method as a param like you just have submitted a form. But since this designed to invoke REST methods you can't use put for creating a new object and if you don't send a body it means that you don't want to send a body so encode is not needed here in the put method. So I think this ticket should be closed."

As we can see in Table 23 above, the routine itself consists of a number of different types of activities, arranged in three sequential, but iterative, stages: inquiring into code, aligning code, and making decisions about code. After a problem has been reported, an inquiry stage is initiated, and the code is then probed to test which internal interdependencies are active, as well as how interdependencies with user behavior and external artifacts may be affected by the proposed solution.

In the aligning stage, many different discursive moves are common. Most important is probably the work that developers conduct to resolve any conflicts that have been uncovered between the suggested code changes and internal/external artifacts and/or prevalent use cases. Refining features is a response to demands on the substantive functionality of the code. At times, certain features seem to be missing from a code contribution, and in these cases core developers often ask the person who submitted the pull request to amend the code to include additional features. Improving documentation is a response to demands to make sure the code comes with appropriate documentation, e.g. appropriately styled commit messages which make the history of code revisions visible. In the last stage, decisions are made with regard to the code that has been generated. This leads either to rejecting or merging the code, or to further iterations of inquiring into and aligning code.

In contrast, the routine performances within the cluster that I have labeled direct problem solving often contain only the minimum activities necessary to move from suggested code to merging of the code into the baseline version of the codebase. These routines are generally shorter (on average 6.40 days in duration, and 7.61 distinct activities) and less heterogeneous (0.05) compared to discourse-driven problem solving. Further, across all the measures of code I extracted, these routine performances also seem to be less complex or extensive. Hence, they represent fairly straightforward solutions to well-defined problems. A representative example of a routine performance from this cluster is displayed below:


Superflight: This is a good way to ensure (and show) that the Rails gem dependencies are up-to-date with the latest versions.
Superflight added a commit
al closed this pull request
al merged this pull request

In this routine performance, code pertaining to a specific problem is suggested and merged into the baseline version of the codebase without any form of explicit discursive shaping.

Overall, we can see that discourse-driven problem solving has a heterogeneous structure constituted by the discursive shaping that occurs within such routine performances, whereas the structure of direct problem solving is substantially simpler, owing to the non-controversial and incremental nature of the problems and solutions dealt with therein.

Table 24. Repertoire of Activities – Direct problem solving

Reporting – Initial reporting of a bug or feature request
  "Defines behavior for cases described in #2586. Without the patch the behavior is different on Ruby 1.8 and 1.9:
  1.8: ArgumentError: interning empty string
  1.9: TypeError: can't convert nil into String
  ... which is not really helpful. ActiveModel::Name can be used with anonymous classes only when a name argument is given."

Writing code – Attaching code written to solve a particular issue
  "Fixed error with 'rails generate new plugin' where the .gitignore was not properly generated if --dummy-path was used and added test case. This is in regards to Issue #3550"

Merging – Merging attached code into the codebase
  "chapati merged commit into masters"

Rejecting – Rejecting code as unsuitable to be merged into the codebase
  "I handle this in other way in #3932
  pedroparedes closed this on Dec 11, 2011"


As we can see in Table 24, the routine itself consists of a number of different types of activities, which developers combine in various ways. These are: reporting, writing code, merging, and rejecting. While none of these activities in themselves are unique vis-à-vis discourse-driven problem solving, the routine as a whole is unique in the sense that it focuses on writing incremental code for uncontroversial features and bugs which can be addressed without the need to resort to discourse amongst developers.

Temporal Cluster Distributions

To determine the temporal distribution of routine performances across the two clusters, I plot the distribution of such performances in Figure 14. Here, I want to highlight three patterns that occur around each of the three releases. If one considers a 6-month period with each release at its center, we can characterize the distribution of routine performances across discourse-driven and direct problem solving in the following ways: 3.1 exhibits a balanced pattern, meaning that there are roughly equal amounts of discourse-driven and direct problem solving (61 direct and 57 discourse-driven routine performances). 3.2 exhibits a direct problem solving-dominant pattern, indicating that the features in this release require less discursive forms of problem solving (102 direct and 68 discourse-driven routine performances). Last, the 4.0 Beta exhibits a discourse-driven problem solving-dominant pattern, implying that the features in this release require larger amounts of discursive problem solving (84 direct and 100 discourse-driven routine performances).
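The windowing behind this comparison can be sketched as follows; the release dates are approximate and the data frame is an illustrative stub rather than the actual dataset:

```python
# Count routine performances per cluster within a 6-month window centered
# on each release date. Column names and the stub data are illustrative.
import pandas as pd

perf = pd.DataFrame({
    "opened_at": pd.to_datetime(["2011-08-01", "2012-01-15", "2013-02-10"]),
    "cluster":   ["direct", "direct", "discourse"],
})

releases = {"3.1": "2011-08-31", "3.2": "2012-01-20", "4.0 Beta": "2013-02-25"}

for name, date in releases.items():
    center = pd.Timestamp(date)
    window = perf[(perf.opened_at >= center - pd.DateOffset(months=3)) &
                  (perf.opened_at <  center + pd.DateOffset(months=3))]
    print(name, window["cluster"].value_counts().to_dict())
```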


Figure 14. Distribution of routine clusters across time

Zooming out to Examine how Discourse Shapes Software Artifacts
What, then, drives the varying distributions of discourse-driven and direct problem solving over time? To determine a coherent set of requirements, and then implement such requirements, Rails developers participate in what I call discursive shaping processes, unfolding across multiple arenas such as conferences, blogs, mailing lists, IRC, and Github pull requests. Effectively, this means that major issues (such as which major technologies should be implemented) are debated across multiple arenas.

For example, one of the most prominently discussed features leading up to the 4.0 release was ‘turbolinks’ – a way of speeding up links within different parts of a web application.

For example, in the blog post below, a core developer voices his concerns with regards to the new, controversial turbolinks feature implemented in the 4.0 release:

"I made a few snarky comments about Turbolinks recently and figured I should write down my thoughts more clearly…Like any other solution, it comes with some caveats… That said, for applications that are willing to carefully think through the requirements of Turbolinks, this solution does provide a nice transitional way to keep building applications without a lot of architectural changes with improved speed."

Others also argued for the benefits of Turbolinks. For example, one developer conducted a number of speed tests on his blog (the post was titled "Seriously. Numbers. Use Them.") comparing various Rails applications with and without Turbolinks. He concluded:

"TL;DR [Too Long; Didn't Read] Turbolinks seems to speed up apps." However, not all developers concurred with the idea that Turbolinks was a good feature to implement in Rails. One developer expresses his frustration as follows: "Let me tell you why Turbolinks frustrates me: No one I respect in Rails-land thinks it's a good idea, but DHH is intent on jamming it in." Eventually, Turbolinks was implemented in a way that made it optional, a choice which, according to some developers, decreased the amount of controversy surrounding the feature: "turning it off is no big deal, so the controversy is (again) unfounded."

All in all, artifacts are shaped in discursive form – meaning that collective, cognitive representations of the artifact distributed across the developer community are being shaped in a dialectical fashion by various discursive moves. This sets the overall stage for the formation of a specific material artifact and its implementation.

Variation in the Intensity of Discursive Shaping across Releases
During the studied time period I noted three distinct releases: two minor (3.1 and 3.2) and one major (4.0). These releases vary quite widely in terms of their characteristics, notably in the extent of new features being implemented. In the 3.1 release the extent of new features is considered to be medium. This can be inferred from the number of contributors (developers who submitted code which was merged into the codebase), the number of files changed since the last release, and the number of commits since the last release. Each of these indicates the extent of changes to the codebase. For example, the number of files changed indicates how widespread the changes were across the codebase, with higher numbers indicating that a larger fraction of the codebase was impacted by the changes.

For the 3.1 release all of these metrics are lower than for the 4.0 release, but higher than for the 3.2 release (see Table 25). This is also exemplified by a number of controversial features being implemented, such as Coffeescript – a language intended to simplify Javascript syntax. In a conference keynote, DHH⁵, the founder of Rails, made the following remark: "If there is one piece of progress…that I've seen the most objections to, it's probably Coffeescript." Hence, issues such as Coffeescript were debated, i.e. discursively shaped, across the community before they were actually implemented as code artifacts. In addition to such controversial topics, much of the work during this release consisted of cleaning up bugs introduced by the previous 3.0 release.

The 3.2 release was generally considered to be an uncontroversial release. This is also illustrated by the lower rate of change in this release: the number of contributors, files changed since the last release, and commits since the last release are the lowest of all three releases (see Table 25). The changes implemented with this release were mostly incremental, and many of them related to speed increases in features which were already implemented (as stated in the release notes: "Rails 3.2 comes with a development mode that's noticeably faster"). Therefore, this release was associated with lower degrees of discursive shaping.

⁵ In my assessment, 'DHH' is a public figure to such a great extent that it does not make sense to conceal his identity. His role as the de facto leader of the Rails community is undisputed, and needs to be recognized as such.


Last, we can see that the extent of change in the 4.0 release is much larger than in the previous releases. For example, as we can see in Table 25, this release consists of 10,000 commits, whereas the previous releases consisted of 6,090 and 4,194 commits respectively. Similarly, it features changes to 2,351 files, whereas the two previous releases incurred changes to 1,513 and 1,183 files respectively. Further, observers within the community generally perceived the changes to be major: "There is tons and tons of stuff in Rails 4 and there is no way that I can possibly explain all..." Not only was there a large number of features, many of them major, but many of these features were also deeply controversial, and therefore hotly debated. For example, in a post labeled "Dangers of Turbolinks" a developer cautions other developers:

"Remember: with Turbolinks you might easily fix surface issues. But it's an iceberg so think twice. You have to write your code in a very special way to work with that. You are likely to have issues with legacy code. And you get almost nothing for that."

Such discourse unfolded across blogs, conference presentations, and the mailing list, effectively shaping the way in which the Turbolinks feature was implemented. All in all, this means that the 4.0 release entailed an amount of discursive shaping exceeding that of each of the preceding releases. Note that LOC added and deleted do not follow this pattern. However, larger changes in LOC are often related to tests, documentation (e.g. translations), and other changes which do not necessarily represent major features being implemented.


Table 25. Properties of Releases

Release   New features   Discursive shaping   Contributors   Files changed   Commits   LOC added   LOC deleted
3.1       Medium         Medium               522            1,513           6,090     9,463       7,926
3.2       Small          Small                469            1,183           4,194     13,753      10,878
4.0       Large          Large                1,060          2,351           10,000    11,512      13,399
(Files changed, commits, and LOC added/deleted are counted since the previous release.)

Thus, it seems that the higher degrees of routine heterogeneity observed (and, implicitly, a higher ratio of discourse-driven to direct problem solving) are concurrent with the development of a larger array of major features – thus necessitating larger amounts of discursive shaping. Given that determining which major features should be implemented, and how, requires a discursive shaping process, I argue that the advent of a feature-rich release exerts a teleological pull on the overall routine structure.

Patterns of Interaction across the Community
A central component of the mechanism which generates routine heterogeneity is the attention to issues, concerns, arguments, and perspectives. This implies that multiple points of view are contrasted and combined across the different pull requests which eventually amount to each release. The faster this mechanism revolves – meaning the higher the degree to which the community is socially integrated – the more routine heterogeneity we are going to see. To validate this notion, I look at a number of social network characteristics of the Github pull request collaborations across three 6-month periods, each centered on the release date of one of the releases.


Table 26. Social Network Statistics

Measure          3.1      3.2      4.0 Beta
Edges            20,904   19,793   28,290
Nodes⁶           1,917    2,037    2,274
LUB-ness         0.13     0.20     0.13
Coreness         14.54    13.33    16.65
Centralization   0.72     0.68     0.96
Hierarchy        0.32     0.38     0.32
Connectedness    0.91     0.89     0.95

In Table 26 above we can essentially see two trends. On the one hand, measures which indicate the degree to which the network has a standardized hierarchical structure (LUB-ness and hierarchy) are generally lower in the 3.1 and 4.0 Beta releases compared to the 3.2 release. On the other hand, measures which indicate the degree to which a network consists of dense connections across large parts of the network (centralization, coreness, and connectedness) are generally higher in the 3.1 and 4.0 Beta releases compared to the 3.2 release.

This indicates, as previously argued, that in the 3.1 and 4.0 Beta releases developers connect to each other to a greater degree, thus validating the idea that communities are more interactive in discourse-dominated stages, and less interactive and more hierarchical during direct problem solving. In essence, what is happening is that discursive shaping intensifies as developers work on a release containing major features. As discursive shaping intensifies and more interaction occurs across the community, routine heterogeneity increases so as to leverage requisite variety and negotiate decisions and solutions.

⁶ Note that 'nodes' here denotes all developers who performed any activity in relation to a pull request, whereas the term 'contributor' used in Table 25 only refers to those who directly contributed code.

Theorizing: Towards an OSS Lifecycle Model

After capturing the overall temporal patterns of routine heterogeneity, analyzing its constituent parts, as well as extracting some emergent qualitative findings with regards to the possible generative mechanisms, it is now possible to develop a tentative OSS lifecycle model. I do this by first examining how the two routine structures I uncovered form a cohesive system, and then explaining how temporal distributions of these two forms of routines are shaped across the OSS release lifecycle. This enables us to propose some basic elements of a lifecycle model of OSS development.

Figure 15. Discourse-driven & Direct Problem Solving


In Figure 15 above we can see how the two routine structures, discourse-driven and direct problem solving, form a cohesive system. Jointly, these routines enable developers to work on incremental problems, essentially by utilizing the logic of "open superposition" (Howison & Crowston, 2014) to solve problems directly, as well as to address controversial problems by utilizing discourse-driven problem solving (Scacchi, 2009; Winograd, 1987). Both routine structures start with the reporting of problems: various bugs and feature requests. Controversial problems then enter a discourse-driven problem solving routine, while less controversial problems enter a direct problem solving routine. In the former, a process of inquiring into and aligning code is initiated, through which features are negotiated so as to satisfy constraints imposed by internal/external artifacts as well as common use cases. In the latter, code can often be written in a simple and incremental manner, solving one problem at a time. After these problem solving processes have concluded, a decision is made whether to reject the solution or merge it into the codebase.

As I observed in the findings, the relative distributions of performances across these two kinds of routines, discourse-driven and direct problem solving, vary depending on discursive processes and community interactions unfolding across the multiple arenas of the community. By characterizing the distinct patterns that these temporally changing distributions of routine performances constitute, I construct an OSS lifecycle model.


Table 27. OSS Lifecycle Model

Stage                    Cleanup                  Sedimentation               Negotiation
Description              Post-release cleaning    Incremental                 Interaction, conflict &
                         and prioritization       superpositioning of code    decision-making
Routine Distribution     Balanced                 Direct-Dominant             Discourse-Dominant
Discursive Shaping       Some controversy         Uncontroversial             Controversial
Community Interaction    Medium                   Low                         High
Routine Heterogeneity    Medium                   Low                         High

(The original table also contains a final row, "Modeling interaction and heterogeneity", charting Community Interaction and Routine Heterogeneity as they flow across releases.)

As we can see in Table 27 above, there are three qualitatively distinct stages in the overall lifecycle: cleanup, sedimentation, and negotiation. The cleanup stage consists of fixing the large number of bugs that a new release exposes, but also of prioritizing new features to be developed. This leads to an overall discursive shaping process which has some degree of controversy. To respond to this moderate level of controversy, routine heterogeneity rises to a medium level. The next stage is labeled sedimentation, and consists of distributed, relatively independent work on incremental problems (Howison & Crowston, 2014). This reflects a situation where there is little controversy with regard to the features being implemented – therefore necessitating little discursive shaping. This leads to lower degrees of attention to concerns and arguments, lower community interaction, and therefore lower degrees of routine heterogeneity. Last, in the negotiation stage, the controversial nature of the features being implemented leads to intensive forms of discursive shaping and high degrees of attention to concerns and arguments distributed across the community. Addressing such issues necessitates a wider structural variety, or heterogeneity, in development routines.

When new features are discursively shaped across multiple arenas, different arguments are compared, contrasted, pitted against each other, and synthesized. Hence, the discursive shaping of artifacts generates community interaction in the form of routine heterogeneity – the actual practices through which developers attend to concerns and arguments and stabilize the resulting material artifacts for implementation. Therefore, heterogeneity constitutes a coping mechanism for translating a set of divergent requirements into an actual feature set.

Note that direct problem solving is representative of straightforward development work based on already agreed upon discursive artifacts, and as such does not require great degrees of heterogeneity. Direct problem solving addresses straightforward issues with the software. Discourse-driven problem solving, on the other hand, is more open-ended and involves the navigation of a variety of perspectives, concerns, and approaches to handling the issues that cannot be addressed easily and directly.


Discussion
In this exploratory study, I found that routine heterogeneity increases concurrently with feature-rich releases, and that this increase in heterogeneity is generated by increased interaction in the community, due to a discursive mechanism which revolves at increasing speeds as more controversial and substantial features are being shaped. Below I discuss the theoretical and practical contributions. The theoretical contributions consist of an increased understanding of routine heterogeneity and lifecycles in OSS, and the practical implications suggest various ways in which core developers can manage requirements discourse.

This study makes three major theoretical contributions: it proposes an OSS lifecycle model, explains the nature of routine heterogeneity, and shows the performance implications of routine heterogeneity. I describe these contributions below. First, as the project moves from releases with smaller features to releases with more substantial new features, a distinct shift towards discourse-driven problem solving can be observed. This reveals an important lifecycle structure. Initially, less controversial changes are dealt with effectively through direct problem solving. However, as a feature-rich release approaches, more discursive shaping is needed to negotiate which features should actually be implemented. This represents an important deviation from the previously theorized logic of "open superposition" (Howison & Crowston, 2014) and highlights the ebbs and flows of discourse throughout the release cycle.

Second, routine heterogeneity is not simply 'variation' or 'volatility' around a standardized routine, as suggested by the open superposition view of open source development (Howison & Crowston, 2014). Rather, routine heterogeneity is generated by qualitatively different routines which perform distinct functions. Further, each of these routines varies in its internal structure. The shift to more heterogeneous routines constitutes the overall increase in heterogeneity concurrent with feature-rich releases. Taken as a whole, the routine heterogeneity of a certain lifecycle stage is a coping mechanism for dealing with complex social or technical problems – it essentially represents an increased capacity to translate various forms of discourse into material artifacts.

Third, routine heterogeneity is not, as suggested by some accounts, necessarily detrimental to performance (Salvato, 2009). Rather, routine heterogeneity is an adaptive response to ambiguous requirements which can only be determined through social discourse. As such, routine heterogeneity is driven by a discourse (Scacchi, 2009; Winograd, 1987) which aims to establish requirements by negotiating various concerns and arguments. This discourse leads to increased attention to concerns and arguments – essentially multiple viewpoints – which become increasingly interconnected as discourse unfolds. Such increasing complexity of the overall discursive structure serves to increase the overall heterogeneity of routine structures, and therefore also their capacity to cope with emergent, ambiguous, conflicting, and interlaced requirements.

On a practical level, my study has implications for the development of tools supporting OSS work. Possibly, current accounts of OSS, such as "open superposition" (Howison & Crowston, 2014), are artifacts of contemporary version control platforms such as Sourceforge and Github. These systems focus on the technical aspects of software engineering, but have less sophisticated affordances for stimulating debate, discourse, and negotiation. My study illustrates the need to further investigate how various systems for supporting discourse could be effectively leveraged within the context of OSS – especially to support work on complex features which may be considered controversial.

In conclusion: in this study I have provided indications of an OSS lifecycle, how it is structured, and which mechanisms shape its ebbs and flows. The dominance of the Bazaar metaphor in OSS research has obscured its dynamic nature; through this study I show how routine structures move adaptively across time together with the intensity of discourse.


Study #3: The Variation of Routines as Responses to Organizational and Technical Conditions

Open Source Software (OSS) is developed by distributed and autonomous developers working on a shared technical platform (Crowston et al., 2012). Scholars studying OSS have mainly focused on delineating the characteristics by which OSS differs from other forms of software development, including socialization (Qureshi & Fang, 2010), emergent governance (O'Mahony & Ferraro, 2007), network structures (Singh et al., 2011), or how independent, solitary contributions add up to software artifacts (Howison & Crowston, 2014). The focus on contrasting OSS development with other forms of software development rests on the implicit assumption that OSS is a homogeneous approach to software development. Yet, upon closer examination, such an assumption appears unrealistic. Indeed, recent findings do not support this assumption: there is substantial variation in the structural features of OSS projects (Crowston et al., 2012) – across both social structures (such as developer networks, e.g. Mockus et al., 2002) and technical structures (such as growth in codebases, Darcy et al., 2010). This variation drives a number of OSS performance metrics, such as the popularity or quality of code (e.g. Daniel et al., 2013). Given these observations, coupled with the insight that contemporary OSS projects cover every imaginable kind of software and a variety of novel social arrangements (von Hippel & von Krogh, 2003; Fitzgerald, 2006), the simplistic metaphor of a bazaar or a cathedral (Raymond, 2001) has probably run its course. More detailed accounts of why and how OSS development routines vary are in order. In other words, what explains development routine heterogeneity within a distributed community (Feldman & Pentland, 2003)?

In this paper I advance a relatively simple argument: routine heterogeneity – the diversity of routine structures present within an OSS project – represents the de facto capacity to handle the project's information processing needs, expressed in streams of informal requirements, feature requests, and bugs. By having the capacity to generate heterogeneous routines, OSS projects create the 'requisite variety' needed to deal with the problems they face. Understanding how such requisite variety is generated is crucial because this heterogeneity represents the resources that development communities need to address the problems they are faced with. Failure to generate the requisite degree of variety is likely to lead to project failure and community breakdown; understanding how such variety is generated is therefore a crucial component in understanding OSS project success.

I craft a systematic explanation of the mechanisms which generate varying degrees of routine heterogeneity – and therefore the capacity for higher performance under varying circumstances (Page, 2010). I situate my theorizing in the well-established Information Processing View (hereafter 'IPV') of organizing (Galbraith, 1973, 1974). This view is promising for studying development routine heterogeneity for two reasons. First, the IPV has been a foundational lens for exploring the emergence of organizational structures for some time (e.g. Eisenhardt et al., 2010), and offers a rich vocabulary and a nuanced tradition to draw upon. Second, and perhaps more importantly, the IPV offers a perspective crafted specifically to understand how managers can use varying organizational strategies to deal with the variety of tasks which need to be performed to adapt to varying external pressures. In this regard, the IPV helps us conceptualize two opposing facets of OSS projects: their shifting information processing needs, i.e. the variety of development tasks and their information content, which must be understood, analyzed, and decided upon when actually writing working code, as well as the available information processing strategies, i.e. how the writing of code needs to be organized and resourced to adequately process the information inherent in the development problems faced.

To demonstrate the viability of my tentative explanation, I conduct a mixed-method, multi-site comparative case study of four OSS projects: Rails, Django, Rubinius, and Bootstrap. I employ a retroductive inference process (Zachariadis et al., 2013) which uses computational and qualitative inquiry to uncover mechanisms which generate quantitative patterns identified by sequence analysis of routines. I find that observed variation in development routines – conceptualized as the degree of heterogeneity in the structures of routines – can be seen as a response to the material variety of codebases. Further, distinct forms of rationality, either discursive or principled, serve as drivers of emergent and latent 'ostensive' routines exhibiting varying degrees of heterogeneity.

The rest of the paper is organized as follows. First, I conduct exploratory data mining using sequence analysis to extract development routines and demonstrate their heterogeneity. Next, I review the extant literature to identify explanations for the observed routine heterogeneity. Failing to find adequate explanations, I then conduct computational analysis of file-interdependence structures and qualitative inquiry into the generative mechanisms behind variation in development routines in OSS projects. Finally, I draw upon the IPV to theorize how interactions between forms of rationality and materiality generate routine heterogeneity in OSS projects.


Exploratory Data Mining using Sequence Analysis

Routines are often geared towards achieving specific organizational goals – such as triaging bugs, inquiring into difficult technical problems, or adding new features. Routines have also been argued to embody “best practices” (Eisenhardt & Martin, 2000) or organizational memory (Nelson & Winter, 1982). Given these characteristics, routines can help to explain the superior or mediocre performance of specific organized forms of OSS. However, in the OSS context certain modifications of the contemporary understanding of routines need to be made. Traditionally, routines are conceived of as having a performative (what actually gets done) and an ostensive (what, formally, should be done) aspect (Feldman & Pentland, 2003). In the context of OSS the latter aspect is quite weak (Scacchi, 2001), and the focus lies on the former aspect. Routines in the context of OSS are therefore emergent from the activities of autonomous developers. This has the following consequence: ostensive routines, to the degree to which they exist, can only be understood as latent patterns across multiple routine performances.

To capture such routine patterns existing latently in digital trace data I utilize a quantitative, inductive method: sequence analysis. This allows us to examine the latent structure and heterogeneity of the routines across the four OSS projects in my study. I theoretically sampled (Eisenhardt, 1989) four different OSS projects, all hosted on the Github OSS platform (https://github.com/): Rails, Django, Rubinius, and Bootstrap. These projects were chosen because they are similar across multiple dimensions, thus allowing us to control for their influence (Yin, 2008): 1) they are all related to web development, 2) they are well-known and have enjoyed similar degrees of success (English & Schweik, 2007; Wiggins & Crowston, 2010), 3) they are all of medium size (medium indicating that they are among the largest projects on Github, but still smaller than the giants of the OSS world – Linux, Apache, Mozilla, etc.), 4) they have substantive communities of developers involved, 5) they are mature by virtue of having multiple official releases, and 6) they enjoy some form of corporate involvement.

Rails and Django are full-stack web development frameworks based on the Ruby and Python programming languages, respectively. Rubinius is a virtual machine for the Ruby programming language, which has recently engaged in a push towards implementing concurrency across the codebase to effectively utilize multi-core architectures. Last, Bootstrap is a frontend web design framework developed originally by two employees at Twitter. Bootstrap is composed of templates which rely on mature technologies (e.g. HTML, CSS, and Javascript).

Data Collection

Because of the online, distributed nature of the projects, I was able to collect rich digital traces of action. The vast majority of work in each OSS project took place on the Github version control system and has been recorded on an event-by-event basis. In order to extract these digital traces a number of scripts were written to query the GHtorrent database (Gousios & Spinellis, 2012), which archives digital traces from Github. Analysis of sequences requires a method of arranging all the activities recorded in an archive into smaller groups of related activities, approximating tasks or episodes of work as understood by the participants. In previous work, scholars have undertaken this manually, by inspecting each activity in its temporal context (Howison & Crowston, 2014; Wagstrom, 2009).


Recent improvements in source-code management systems, however, enable participants to group activities themselves, around the concept of a “pull request”. A “pull request” is created when a developer has completed code changes (a ‘patch’) that they wish to have merged into the main repository (i.e., a request to those with decision-making power to ‘pull’ these new changes into the main repository). By grouping activities according to which pull request they are associated with, I am choosing to examine the project’s routines in discussing and considering which code to include in the main repository. I do not capture all activity in the project (excluding, for example, pre-coding discussions such as bug reports and/or feature requests), but I do capture all routine performances directly related to writing code for a project. This approach is necessary for a number of reasons. First, not all potential activities related to writing code can be captured in a homogenous format. For example, code-writing activities which happen on personal computers and are not shared publicly are impossible for us to capture, and activities happening on IRC or over mailing lists cannot be consistently integrated into sequences of action pertaining to the same unit of analysis (i.e. a pull request). Second, the Django project utilizes a secondary platform to report bugs. Through restricting my scope to pull requests which have at least 1 commit attached, I ensure standardization of the data across all four projects.
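To make this grouping concrete, the sketch below shows one way such traces can be assembled into per-pull-request activity sequences. This is a minimal illustration rather than the actual extraction scripts: it assumes the GHtorrent events have already been exported as (pull request ID, timestamp, event type) records, and the sample records are hypothetical.

```python
from collections import defaultdict

def build_sequences(events):
    """Group raw event records into per-pull-request activity sequences.

    `events` is an iterable of (pull_request_id, timestamp, event_type)
    tuples, e.g. exported from the GHtorrent database. Returns a dict
    mapping each pull request ID to its chronologically ordered event types.
    """
    grouped = defaultdict(list)
    for pr_id, timestamp, event_type in events:
        grouped[pr_id].append((timestamp, event_type))
    # Sort each pull request's events by time, keeping only the event types.
    return {pr_id: [etype for _, etype in sorted(evts)]
            for pr_id, evts in grouped.items()}

# Hypothetical records; ISO 8601 timestamps sort correctly as strings.
events = [
    (42, "2012-03-01T10:00", "open"),
    (42, "2012-03-02T09:15", "merge"),
    (42, "2012-03-01T11:30", "comment"),
    (57, "2012-03-05T14:00", "open"),
    (57, "2012-03-05T15:00", "close"),
]
print(build_sequences(events))
# {42: ['open', 'comment', 'merge'], 57: ['open', 'close']}
```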

My scripts capture all activities related to pull requests – code patches submitted to a project – in each project (5,624 routines consisting of 49,894 activities), across a single year (6th of January 2012 to 6th of January 2013). The sample sizes are detailed in Table 28 below. These digital traces are residues of routine performances (e.g. Anjewierden & Efimova, 2006). Hence, activities are made ‘measurable’ through recording traces of activity sequences on Github. For example, a developer may commit code to a local repository on his own laptop and submit it as a pull request for review at Github, which creates a string of comments and eventually leads to the merging of the code into the main repository. Each specific action that is done in relation to a specific Pull Request Identification (PID) number leaves contingent sets of digital traces (commit, pull request, comments, and merging), which researchers can extract and analyze.

Table 28. Sample Sizes

Measure          Django   Rubinius   Bootstrap   Rails    Total
N (routines)*    621      279        1,440       3,284    5,624
N (activities)*  3,004    2,620      4,420       39,850   49,894

* digital traces of routines are stored in both human-readable as well as quantitative form.

Data Analysis

I analyzed digital traces of routines using sequence analysis (Abbott, 1995). By using such techniques, I could detect the heterogeneity of routines – variation in the types of activities and their ordering – by computing the dissimilarities between all observed activity sequences. This can be accomplished by using Optimal Matching (OM) techniques; these compute the ‘distance’ between pairs of activity sequences by evaluating the ‘cost’ of converting one to the other. This can be illustrated by considering the following two activity sequences:

1: open / comment / merge
2: open / comment / close

Because we are not only measuring types of activities, but also their ordering, we need to compare every position to see what changes are necessary to convert one sequence into the other. In this example, positions 1 and 2 are identical, whereas position 3 differs across the two activity sequences. Therefore we need to replace the ‘merge’ activity in sequence 1 with a ‘close’ activity to arrive at sequence 2. This simple replacement allows us to quantify the ‘cost’ of conversion, or distance between activity sequences: if we assume that the cost of a single conversion is 1, the total cost in this example amounts to 1 (Abbott & Hrycak, 1990). Converting activity sequences with more differences results in higher costs and distances.
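To illustrate the OM computation, the following sketch implements a generic edit distance over activity sequences by dynamic programming. It is a simplified stand-in for dedicated sequence analysis software (cf. Gabadinho et al., 2011), with the substitution and indel costs exposed as parameters.

```python
def om_distance(seq_a, seq_b, sub_cost=1.0, indel_cost=1.0):
    """Optimal Matching distance: the minimal total cost of insertions,
    deletions, and substitutions needed to turn seq_a into seq_b."""
    n, m = len(seq_a), len(seq_b)
    # dp[i][j] = cost of converting seq_a[:i] into seq_b[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * indel_cost
    for j in range(1, m + 1):
        dp[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (0.0 if seq_a[i - 1] == seq_b[j - 1] else sub_cost)
            dp[i][j] = min(sub, dp[i - 1][j] + indel_cost, dp[i][j - 1] + indel_cost)
    return dp[n][m]

# The example from the text: one substitution at position 3 costs 1.
print(om_distance(["open", "comment", "merge"],
                  ["open", "comment", "close"]))  # 1.0
```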

In order to measure the overall heterogeneity of routines in each project, every activity sequence within each project is compared to every other activity sequence (within the same project) using an OM algorithm. This establishes a matrix where each entry denotes the distance between two activity sequences. The OM algorithm measures the distances between activity sequences in terms of the number of insertions and deletions (called indels) as well as substitutions necessary to transform one activity sequence into another. The total distance between any two activity sequences is then called an OM distance and is a measure of dissimilarity – the degree to which two activity sequences are not similar to each other. For example, for a project that exhibits activity sequences that vary a lot (between simple and rich in terms of activity types and their ordering), the analysis would show high differences between the types of routines. Averaging these differences gives us the overall routine heterogeneity of the project. On the other hand, if a project only had activity sequences consisting of the same set of activities ordered in the same way, these activity sequences would be more similar to each other; the difference scores would therefore be lower and, when averaged, show the project to be low in routine heterogeneity.

The heterogeneity values for each project were computed as the average of dissimilarities across all pairings of activity sequences within each project. To capture the structure of sequences (i.e. activity types and their ordering) rather than mere differences in length, I normalized for the length (number of activities) of each activity sequence. Due to unequal sample sizes and variances, I used a non-parametric Kruskal-Wallis test to test for statistically significant differences across the projects. I analyzed the sensitivity of these results by setting the cost of conversion to 2, 1, and finally 0.5. Each of these settings displayed incrementally smaller dissimilarities across projects, but all displayed significant differences and the same overall pattern (the last test, for a cost of 0.5, was significant at chi-squared = 1,264,612.00, df = 3, p < 0.01). I finally settled on the heterogeneity values generated by setting the cost of conversion to 0.5, well beyond standards suggested by previous researchers (Gabadinho, Ritschard, Müller, & Studer, 2011; Pentland, 2003). This ensures that the results are not inflated by the choice of computational parameters.
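A minimal sketch of this heterogeneity computation and its accompanying test follows, reusing the om_distance function above. The toy sequences are placeholders rather than actual project data, and normalizing by the longer sequence’s length is one common choice, as the exact normalization formula is not spelled out here.

```python
import itertools
from scipy.stats import kruskal

def project_dissimilarities(sequences, sub_cost=0.5):
    """Length-normalized OM distances for every pair of sequences in a project."""
    return [om_distance(a, b, sub_cost=sub_cost) / max(len(a), len(b))
            for a, b in itertools.combinations(sequences, 2)]

# Placeholder data standing in for the extracted activity sequences.
projects = {
    "django":    [["open", "comment", "merge"], ["open", "close"],
                  ["open", "comment", "close"]],
    "rubinius":  [["open", "comment", "comment", "merge"], ["open", "merge"],
                  ["open", "comment", "close"]],
    "bootstrap": [["open", "close"], ["open", "close"], ["open", "merge"]],
    "rails":     [["open", "comment", "merge"], ["open", "close"],
                  ["open", "comment", "comment", "close"]],
}
dissim = {name: project_dissimilarities(seqs) for name, seqs in projects.items()}

# Average pairwise dissimilarity = the project's overall routine heterogeneity.
heterogeneity = {name: sum(d) / len(d) for name, d in dissim.items()}

# Non-parametric test of differences in dissimilarities across the projects.
h_stat, p_value = kruskal(*dissim.values())
```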

Routine Characteristics

The heterogeneity of routines captures the diversity in the overall structure of the development activities which unfold in each project. The heatmaps in Figure 16 below visualize the distribution of heterogeneity across each project. The number superimposed on each heatmap is the average heterogeneity in each project. Both X- and Y-axes are activity sequences, grouped by similarity. Darker (more red) colors depict activity sequences that are more similar to each other, while lighter (more yellow) colors depict those that are less similar. The diagonals are dark (red) because every activity sequence is 100% similar to itself. We can see how the overall color of each heatmap mirrors the heterogeneity values, with Bootstrap and Django being darker (0.41 and 0.47 respectively), while Rails and Rubinius are lighter (0.62 and 0.56 respectively). Further, the particular ordering of activity sequences along the X- and Y-axes is achieved through bottom-up hierarchical clustering, which groups activity sequences based on their similarity, starting with the most similar sequences. Large blocks of red therefore represent multiple activity sequences which are very similar to each other, and which could therefore be argued to form coherent types of subroutines. I will, however, not pay further attention to routines within each project in this study, as this has been thoroughly investigated in studies #1 and #2.
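The heatmap layout can be reproduced along the following lines. This is an illustrative sketch that assumes average linkage for the bottom-up clustering (the specific linkage criterion is not reported here) and takes a precomputed square dissimilarity matrix as input.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def plot_heterogeneity_heatmap(dist_matrix, project_name):
    """Reorder a pairwise dissimilarity matrix via bottom-up hierarchical
    clustering and render it so that similar pairs appear dark (red)."""
    condensed = squareform(dist_matrix, checks=False)  # upper-triangle vector
    order = leaves_list(linkage(condensed, method="average"))
    reordered = dist_matrix[np.ix_(order, order)]
    plt.imshow(1.0 - reordered, cmap="YlOrRd", interpolation="nearest")
    plt.title(f"{project_name}: {condensed.mean():.2f}")  # average heterogeneity
    plt.show()
```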

[Heatmap panels – Django: 0.47 (0.23); Rubinius: 0.56 (0.16); Bootstrap: 0.41 (0.25); Rails: 0.62 (0.18)]

Figure 16. Patterns of Heterogeneity across & within Projects

Now we have established a basic pattern that needs to be explained. Routines vary in their heterogeneity across projects – but what is it that drives such heterogeneity?


Explaining Heterogeneity of OSS Development Routines Across Projects

The extant literature has attempted to explain the drivers of heterogeneity in routines, processes, and patterns in multiple ways. In this section I will review these explanations to see if they are sufficient to explain the diversity of my results above. I will examine ostensive routines, cognitive focus, digitalization, and evolutionary mechanisms.

Strong ostensive routines have been theorized to decrease heterogeneity. For example, Six Sigma or Capability Maturity Model (CMM) approaches rely on strong ostensive routines to decrease routine heterogeneity, so as to increase the quality of work conducted (Schroeder et al., 2008). Minimization of routine heterogeneity is achieved through specifying strict process models that prescribe how types of activities should be sequenced. Through this mechanism, various workflows are standardized, and the potential for overall routine heterogeneity is minimized. For this explanation to work we would need to find that projects with lower heterogeneity have stronger ostensive routines than those projects with higher heterogeneity, other things being equal. Yet the literature on OSS has argued that strong ostensive routines are not a common feature in this context (Scacchi, 2001).

Further, stabilization of cognitive focus has been used as a mechanism for explaining the reduction of routine heterogeneity in design routines (Salvato, 2009). The proposed mechanism operates through minimizing exploration and maximizing exploitation using stabilized cognitive focus – strict attention to a limited set of agreed-upon features – as a coordination mechanism. Such stabilization of focus and the resulting homogeneity of routines lead to a greater economy of activities. However, such subjectively constructed and collective states of mind may be difficult to engender and control in distributed contexts such as OSS. Further, the minimization of routine heterogeneity may not even be desirable. On the contrary, some theorists have actively sought to explain the generation of higher degrees of heterogeneity.

Digitalization may drive higher degrees of heterogeneity in various design processes. For example, it has been argued that routine heterogeneity increases as rich digital capabilities are leveraged throughout design processes (Gaskin et al., 2011). The proposed mechanism operates through increased opportunities for varied combinations of design actions afforded by digital tools, thereby increasing the overall heterogeneity of the design routines that unfold. However, since all the OSS work examined above is already digital, this cannot explain variation in heterogeneity across separate OSS projects (although it might help explain variation across OSS and non-OSS contexts).

Last, evolutionary processes have also been argued to lead to increased heterogeneity, because heterogeneity is adaptive – meaning that higher degrees of heterogeneity (at least in natural systems, but conceivably also in socio-technical systems) lead to higher degrees of robustness, and therefore survivability and performance (Page, 2010). This line of reasoning would lead to the conclusion that heterogeneity in OSS projects is driven by evolutionary processes (Lee & Cole, 2003). However, without identification of specific sources of variation, selection, and retention, and of their differences across projects, this perspective does not help us explain differences in heterogeneity.

All in all, previous explanations of heterogeneity are problematic when applied to the OSS context. They either rely on organizational capabilities that are not available to OSS developers (i.e. specifying strong ostensive routines), mechanisms which are difficult to influence in virtual contexts (i.e. stabilized levels of cognitive focus), or aspects which are common to all OSS projects (e.g. degree of digitalization or evolutionary mechanisms).

A Comparative Study of OSS Routines: Explaining Routine Heterogeneity

Seeing as the extant literature does not provide a cohesive picture of what factors may account for routine heterogeneity within the OSS context, I opted to conduct qualitative inquiry into the projects, so as to uncover the generative mechanisms behind observed routine heterogeneity. Themes emerged inductively, through a hermeneutic process (Boland et al., 2010) of iterative back-and-forth analysis of data on routines (as text) and the interview and archival data (as context), in concert with further grounding in the extant literature. As such, in my analysis I “zoomed in and out” (Gaskin et al., 2014) across low-level routine data and an interpretive understanding of its contextual significance (Goggins, Galyen, & Laffey, 2010). By doing so I essentially followed a retroductive inference process associated with critical realism (Zachariadis et al., 2013), which seeks to identify, through qualitative inquiry, the underlying generative mechanisms behind observed patterns in quantitative data.

Table 29. Sample Sizes

Measure           Django   Rubinius   Bootstrap   Rails   Total
N (interviews)    12       17         4           13      46
N (public audio)  1        3          1           11      16
N (public video)  17       3          3           10      33
N (public text)   9        8          18          10      45

Table 29 above summarizes my data collection efforts. Interviews were conducted over Skype, as well as through attendance at industry conferences (e.g. DjangoCon), and in some cases over email when non-English-speaking developers (especially Japanese developers within the Ruby community) were uncomfortable speaking in English. These interviews were semi-structured and focused on each participant’s experience of working on, and actively contributing to, one of the four focal projects. Interviews were dynamically structured based on each interviewee’s actual experience, ranged in duration from 45 minutes to 3 hours, and were conducted with founders, corporate sponsors, core team members, regular contributors, as well as peripheral contributors of each project. Common questions were of the type: “Please describe a recent contribution you made – what did you do?” The dataset also includes audio (podcasts and interviews, ranging in duration from 1 to 2 hours each), video (conference keynotes, panels, talks, and interviews, ranging in duration from 30 minutes to 3 hours each), and text (blog posts, interviews, and documentation). Interviews were transcribed, public audio had already been transcribed by external parties, and specific sections of public video were transcribed based on their relevance to my research question.

In total 670 codes were generated and sorted under categories such as “interdependence”, “principled rationality”, and “routine structures”. As I synthesized my codes into these categories, the ways in which projects responded to degrees of task complexity became clear – through two distinct forms of rationality: principled and discursive. This analysis permitted me to see how the forms of rationality in each OSS project influenced the emergence of varying degrees of routine heterogeneity.

An Information Processing View of OSS Development

To furnish a theoretical account of why and how development routines come to vary, I draw upon Galbraith’s (1973, 1974) theory of information processing (IPV). In Galbraith’s view, organizations face information processing needs of varying degrees, stemming from environmental pressures. The amount of information to be processed is a function of the task complexity entailed by these environmental pressures. To respond to task complexity, managers use organizational structuring mechanisms – hierarchies, goals, and programs (March & Simon, 1958) – to create organizational strategies, such as self-contained tasks (to decrease information processing needs) or lateral relations (to increase capacity for processing information). The two central constructs of the IPV – information processing needs and strategies for processing information – can then help us observe how specific development routines emerge to address specific problems related to a codebase which is materially configured in a specific way.

However, in OSS, information processing needs cannot be argued to be wholly external to a project. Originally, information processing needs stem from various requirements (Scacchi, 2009), which are captured in bug trackers, roadmaps, blog posts, and conference presentations. These requirements concern changes to a codebase which is configured in a specific way. Therefore, the material configuration of the codebase greatly influences the nature of the development problems which developers must contend with. For example, more complex codebases tend to generate problems which are complex in nature (Jason et al., 2013). Solving such problems requires the performance of a greater variety of tasks. Hence I define information processing needs (hereafter ‘needs’) as sources of task complexity based on material characteristics of the codebase. The specific characteristics of such needs form inputs to the formation of information processing strategies.

To show how organizations respond to information processing needs, Galbraith posed a number of information processing strategies, such as lateral relationship structures (such as taskforces) and vertical forms (such as information systems), both of which increase information processing capacities. These can be characterized as cross-sectional, or ‘relational’, structures (Lindberg, Gaskin, Berente, Lyytinen, & Yoo, 2013). Software development, however, is typically conceptualized and organized in terms of ‘methodologies’ – essentially procedural, rather than relational, strategies for processing information, such as agile and waterfall approaches (e.g. Vidgen & Wang, 2009). My focus on OSS development routines invites us to extend those aspects of Galbraith’s theory which can be usefully applied to procedures for creating specific outcomes such as working code. In this context, routines perform the information processing work inherent in software development – i.e. solving technology-related development problems, such as fixing bugs or soliciting and implementing user-requested features.

In order to address problems stemming from a codebase exhibiting certain material characteristics, information processing routines and needs must be aligned with each other. Ashby’s law of requisite variety (Ashby, 1956), a view central to the ‘fit’ or ‘alignment’ focus of the IPV, implies that an OSS project can adapt successfully to the problems posed by a codebase by matching the task complexity necessitated by problems through aligning routines to manifest similar degrees of variety (Venkatraman & Camillus, 1984; Venkatraman, 1989). That is, the task complexity of needs must be matched by the heterogeneity of routines – variation in activity types and their ordering. The provision of routines with requisite degrees of heterogeneity rests on the assumption that distributed performances of coding practices (Salvato, 2009) become clustered or ‘glued’ together based on shared properties (Liu & Pentland, 2011) that address observed information processing needs. For example, routines with lower degrees of heterogeneity are more likely to deal with the intake, vetting, and triage of bugs, whereas routines with higher degrees of heterogeneity are more likely to be used to write code to fix such bugs (Lindberg & Berente, 2014).

Therefore, developers working on different codebases adopt different information processing strategies, which provide varying levels of routine heterogeneity in response to the task complexity posed by problems stemming from differently configured codebases.

Material Characteristics

One of the central claims of this study is that not all OSS projects are the same – they are faced with different information processing needs, manifested in the material characteristics of their codebases. Different projects simply present different degrees of complexity, and therefore necessitate that developers fashion different information processing structures to address such needs.

To capture these different information processing needs I used Cataldo, Herbsleb, & Carley’s (2008) approach to computing the degree of interdependence and complexity in projects by looking at files which are edited together. This data is extracted from the Github API by collecting the names of files that are attached to the same commit. The assumption here is that if two files are edited together in the same commit, then there is a logical, but not necessarily syntactical (as in functions in different files calling each other), interdependence between these two files. Based on this, a graph can be created where the nodes are individual files, and edges are created between files when they are edited together.
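A sketch of this graph construction is shown below, assuming the per-commit file lists have already been pulled from the API. The use of the igraph library here is illustrative rather than a record of the actual analysis scripts.

```python
import itertools
import igraph as ig

def co_edit_graph(commits):
    """Build a file-interdependence graph: nodes are files, and an edge
    connects two files whenever they appear together in the same commit.

    `commits` is a list of per-commit file-path lists.
    """
    files = sorted({path for commit in commits for path in commit})
    index = {path: i for i, path in enumerate(files)}
    edges = set()
    for commit in commits:
        for a, b in itertools.combinations(sorted(set(commit)), 2):
            edges.add((index[a], index[b]))
    graph = ig.Graph(n=len(files), edges=list(edges))
    graph.vs["name"] = files
    return graph
```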


Once this data had been extracted I applied a walktrap community detection algorithm (Pons & Latapy, 2005). This algorithm utilizes random walks across nodes along paths made up of available edges. Through such random walks, communities – clusters of tightly interconnected files – are detected. Hence, the graphical representations of these file-interdependence graphs, as well as the descriptive statistics coming from the walktrap community detection, give us a sense of the degree to which a file structure is widely varied, or whether it is just a small number of folders with files in them.
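Applying walktrap to such a graph is then straightforward in igraph; the sketch below continues the co_edit_graph example with a hypothetical commit list and uses the library’s default walk length of four steps.

```python
# Hypothetical per-commit file lists (documentation files already excluded).
commits = [
    ["app/model.rb", "app/controller.rb"],
    ["app/controller.rb", "app/view.rb"],
    ["lib/parser.rb", "lib/lexer.rb"],
]
graph = co_edit_graph(commits)

# Random-walk-based community detection (Pons & Latapy, 2005).
dendrogram = graph.community_walktrap(steps=4)
clusters = dendrogram.as_clustering()
print(len(clusters))  # the number of walktrap communities, cf. Figure 17
```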

[File-interdependence graphs: Django (walktrap communities = 34, task complexity = low); Rubinius (walktrap communities = 52, task complexity = high); Bootstrap (walktrap communities = 16, task complexity = low); Rails (walktrap communities = 43, task complexity = high)]

Figure 17. Material structures of code bases

As you can see in Figure 17 above, the concurrent-editing structures of the projects differ markedly. We can clearly see that, for example, Bootstrap has the simplest structure, with only 16 clusters identified, whereas Rubinius has the most complex structure, with 52 distinct clusters identified. Note that in all of these projects I have excluded files related to documentation in order to focus on actual interdependencies in executable code.

My argument here is that the task complexity entailed by working with a codebase scales with the number of clusters of tightly interconnected files. Such clusters can be understood to represent distinct sets of functionality, or at least distinct parts of the codebase, which presumably require different approaches.

The reader will also notice that the number of walktrap communities scales directly with the heterogeneity of these projects. Hence, it seems feasible to argue that routine heterogeneity is an overall response to the variety of material structures that the codebase of a project contains. At this point it could be claimed that I have answered my research question – what drives routine heterogeneity. However, I argue that this connection between the complexity of a codebase and the heterogeneity of routine performances is not automatic; rather, routine heterogeneity must be actively generated. In traditional, formal organizations such routine heterogeneity would be generated by ostensive routines, but as I have argued above these are largely not available in OSS. Therefore, I conducted qualitative inquiry into these projects to understand the mechanisms through which they structure their routine performances, so as to understand how the requisite variety of routine heterogeneity was generated.

Cognitive Characteristics

The main finding of my qualitative inquiry concerns the cognitive characteristics, or forms of rationality, through which developers approach their tasks. Developers use these to generate the degrees of routine heterogeneity necessary to match the task complexity they face. Emergent from the data, I have identified two forms of rationality: discursive and principled. The former consists of heuristics produced and maintained by discursive processes of social and technical inquiry, whereas the latter consists of heuristics based on fundamental goals, visions, and design principles governing the identity of a project.

Principled Rationality

Principled rationality captures the forms of thinking and cognitive heuristics which OSS is traditionally thought of as being premised on. These are related to what Simon (1978) referred to as “substantive rationality” and rest on providing simple and direct heuristics for how to evaluate various pieces of code (see Table 30), and especially the effects that they produce. Often such principled rationality is based on commonly accepted standards, such as root artifacts (e.g. the Python programming language forms the basis of the Django project), external artifacts with which compatibility must be achieved, or centrally governed decisions.

For example, in the Django project, principled rationality is expressed in terms of a formalized and bureaucratic approach. Here, we see a focus on formalizing processes in various types of documentation:

“The documentation of Django itself as a core documentation is fantastic. If you printed it, it’d be way over 1,000 pages…The way the original creators of Django wanted to do things is to explain how it works so that people could easily join the community and contribute, and so we have a guide which is dedicated to contributing.” (Django Developer)

In this case, the relatively centralized and bureaucratic governance structure of the Django project enables leaders within a community to a) point out what kinds of features should be implemented, and b) provide clear specifications of appropriate development standards. This increases the capacity of a project to process information through convenient sourcing and evaluation of solutions from the community, while still providing economy of routines through moderate amounts of heterogeneity. Further, in the case of Bootstrap, principled rationality often focuses on aesthetic aspects – a common sense of good contemporary web design provides the guidepost for how to shape the framework:

“Almost everything that’s made its way into Bootstrap is a component or design paradigm that has been basically used in some other place and design development.” (Bootstrap Developer)

Importing “component[s] or design paradigm[s]” provides a way for Bootstrap developers to create an external reference point for what types of contributions are desirable. Hence, reference points can be thought of as heuristics, or preferred ways of solving problems. Such heuristics increase information processing capacity by providing moderate amounts of heterogeneity, and are often executed through the exercise of authority:

“utrecht opened a pull request: Added nav-tabs-link-hover-color
utrecht added a commit
MO commented: I don't see a need for this—nav tabs use the default link colors on hover.
MO closed the pull request”

Such simple heuristics geared towards evaluating outcomes allow these projects to source a wide array of contributions without complicated mechanisms. For both Django and Bootstrap these heuristics are driven by centralized governance structures through which external standards (contemporary web design standards in the case of Bootstrap, or established community standards in the case of Django) can be selected and implemented. This form of principled rationality increases the capacity for processing information through providing moderate routine heterogeneity.

Table 30. Development Heuristics

Form of Rationality | Heuristic | Description | Quote
Principled | Referring to root artifacts | Using root artifacts as a way to justify a feature | “I'm happy to adjust the whitespace (per PEP 8 I assume), move this into another file (please tell me where), and shorten the commit title.”
Principled | Referring to external artifacts | Using external artifacts as a way to justify a feature | “Note that internet explorer 8- doesn't support preventDefault. So you probably want to do this instead:”
Principled | Centrally governed architectural priorities | Making architectural decisions using central decree | “because these are defined syncronously, i think if anything i'd rather support a commonjs pattern. Thanks for the request though.”
Principled | Centrally governed design priorities | Making decisions with regards to functionalities using central decree | “refreshing a page should cause mouseout and remove the element from the dom? Not crazy about adding an option for this.”
Discursive | Social inquiry | Inquiring into goals and needs of other developers | “Could you explain why you think this is a good change?”
Discursive | Procedural specifications | Formally specifying appropriate implementation procedures | “This is the procedure we request people follow when writing specs for making changes to Rubinius”
Discursive | Work interdependence constraints | Relating solutions to internal constraints within the development process | “This can't be merged in because it doesn't yet deal with GC lifetimes of objects. We need to figure that out before we can offer this API.”
Discursive | Debate | Arguing, contesting, and reconciling conflicting views | “This is really poorly implemented. record.errors.add(:password, "is too weak and common") 1) Errors are added on :password attribute instead of the attribute user wanted to validate 2) Previous implementation was I18n aware. This is not. English message is always added. Nice idea but needs much better implementation in my opinion.”

This basis of principled rationality helps projects draw their heuristics from a single dominant source – some set of principles upon which the project is run. For example, the Django project is embedded in a strong sense of how code should be written, partially derived from its basis in the Python programming language community. The basic ideas of this community are expressed in the “Zen of Python”, a set of aphorisms which describe the guiding principles for the design of the Python programming language:

“Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested[…]”

Aligned with this set of aphorisms, a number of Django developers express a focus on the ‘explicitness’ of Django code, meaning that it is readable, and that the functionalities it performs can be directly discerned from reading the code. Developers do not need to assume that hidden interdependencies activate ‘magic’ functionality.

Similarly, another basis of principled rationality is the power that BDFLs (“Benevolent Dictators For Life”, a term often used for project founders who stay involved throughout the lifetime of the project) exert over decisions made. For example, the Bootstrap project operates under the principle that the BDFLs maintain a tight level of control, therefore making their decisions the principles upon which all other heuristics are based:

“That’s happened a couple of times where people aren’t either aware of what we’re thinking and that could be for a number of reasons, or they just have an idea and they’re thinking ‘Hey, this would be kind of cool to add to Bootstrap.’ When it’s something that’s very simple that we just don’t want to do, we don’t want to get into, we’ll just straight up tell people and you know end the discussion there. We’ll close the issue or close the pull request immediately and try to be as clear as possible for why we’re doing that without… The trick is to be brief and succinct without being an asshole, and that’s kind of tough on the Internet because when it’s just written text, it’s very easy to come off as a jerk, you know, and the idea is that we’re just kind of reiterating the same thing over and over again for a lot of those kind of situations where ‘Okay, this isn’t a direction we want to take Bootstrap. Here’s why. We’re gonna close this issue,’ or ‘Hey, this has come up before. See this other issue,’ and we could reference that and say ‘Go there to see why we’re not doing that,’ and stuff like that.” (Bootstrap Developer)


Through making such decisions, the BDFLs make their practical vision known throughout the community. This aligns the behaviors and contributions which developers make:

“He [a core developer] was helping me out with how he thinks it should work and how he thinks it should operate. That’s when he told me to take that ‘in’ clause out of the position or placement option, but I fixed it and he was like ‘Oh, take that out. We’ve got to edit this up to fix it so it’ll work’ and this and that, and I mean so he helps out. He will put his two cents in on what should be done or what shouldn’t be done, and as for me the contributor, I would just edit it up so it still works you know how it should work, but then it would match what they [the core developers] would want it to be.” (Bootstrap Developer)

In sum, the actual heuristics that will be applied when using a principled basis of rationality are determined by the fundamental goals, visions, and design principles that lie at the heart of a project. Whether these principles are expressed as BDFL authority or through documented and formalized principles, the effect on routine heterogeneity is the same: heterogeneity is reduced. Usage of such principles both enables and is enabled by lower degrees of variation in file-interdependence structures, which tends to simplify routines. Principled heuristics, in turn, allow for quick access to guidance, therefore leading to the unfolding of less heterogeneous routine structures.

Discursive Rationality

Discursive rationality provides heuristics for developers to use throughout the problem-solving process. This form of rationality is similar to what Simon (1978) referred to as “procedural rationality” and rests on providing heuristics for how to manage a design process as it unfolds sequentially over time (see Table 30 above). Often such heuristics are actively shaped through social inquiry, the provision of procedural specifications, the examination of task interdependence constraints, and straightforward debate.


This form of rationality was found to be dominant in two of the projects that I studied – Rubinius and Rails. In the former project, Rubinius, discursive rationality is expressed in deep discourse with regards to solving problems. For example, in the excerpt below a Rubinius developer talks about the process of solving problems, how it is driven by the highly technical nature of the central Rubinius artifacts, and how this in turn leads to a focus on solving problems discursively:

“Oh basically if you have something that crashes your working machine, it crashes in different ways each run, or sometimes it doesn’t crash at all. So because of the risk conditions, it’s usually timing related, so it completely depends on all of these uncontrollable factors, what happens. So like how your kernel is scheduling your threads, how your, I don’t know, how your memory behaves in your system, whether certain code or certain objects are in your CPU cache or not and all that kind of stuff comes into play and it’s way too complex. It cannot be consistent over each run, so maybe if you run it one time, it crashes in a certain place in your code. If you run it a second time, it crashes in another place in your code, so it’s something you have to like deal with and try to make sense of.” (Rubinius Developer)

At times such sessions happen in isolation, and at other times they are facilitated by discourse between developers, as is illustrated below:

“I’ve done some like screen-sharing session. So basically just share what you’re doing, what you’re working on to see if you can bump ideas back and forth on what the actual problem is. So those are things that sometimes we do, but it’s pretty hard to schedule something that works for both because the business times are different, all that kind of stuff.” (Rubinius Developer)

Further, higher-level strategic initiatives, such as which features to implement, also emerge from such discursive processes. In Rails, there is a focus on collating best practices from the community:

“when you’re looking at absorbing best practices…The Rails strategy actually says ‘We’re not gonna wait for everyone to be doing the same thing. We’re gonna be waiting for everyone to be solving the same problem, and as soon as everyone is solving the same problem, we’re gonna look around, see what good ideas there are in the community and we’re gonna try to find a solution that’s best in class for solving that problem,’ and that inevitably means that there’s a certain amount… That’s what people mean when they say Rails is an opinionated framework, is that there’s a certain amount of opinions that has to go into deciding. You don’t say ‘Yes, you need to solve this security problem. Pick your own solution. Find a plug-in.’ You say, we say ‘Everybody’s solving this security problem. Let’s make sure that we solve it,’ and that I think is different. It’s a lot more controversial I think when designing things.” (Rails Developer)

Discursive rationality helps projects draw their heuristics from a process of inquiry into social and technical concerns. For example, to establish how contributions should be made, and which contributions should be made, “oceans of regulars” (an expression used by a Rubinius core developer) are encouraged to give their opinions. This is done to attract, in Raymond’s classical parlance (2001), more ‘eyeballs’, thus providing diverse perspectives on typical problems. This is illustrated by the open commit policy in Rubinius:

“Well one of the things, I really started out with like really small things and what I found is that it was actually people were very welcoming, so like a lot of, like every contribution felt like it was appreciated. There was a lot of trust given to it and it’s just still something we want to, we do today, and that’s what I experienced too when I came in is that like if you have your first patch accepted, you get direct commit access for example.” (Rubinius Developer)

This commit policy essentially states that as soon as a developer has had a patch (pull request) accepted to the project, he or she now has commit access to the main repository – essentially the ability to edit the baseline version of the software. This encourages widespread participation in the process of inquiring into the highly technical nature of the Rubinius codebase. Further, in the Rubinius project, core developers advocate a specific form of discourse for establishing which features to implement. In an interview, this was described to me in the following way:

“So what I’m proposing in that design process, what I’m saying is, it doesn’t matter who you talk to about it. You have all the conversations you want, but if you want to propose a change to Ruby, you have to implement it, and everybody has to implement it so that everybody can actually see the actual consequences of the proposed future. And at the point that you implement it, once everybody has done so, then you can have a useful conversation about whether or not it ultimately makes sense. Remember, my proposal is that in order to actually propose it, you need to provide the documentation and make the case, which is basically saying ‘This is what we have now (or this is what we don’t have now), and this is what we should have (or shouldn’t have). Here’s how it would work, and this is why I think it’s a good idea.’ If everybody says ‘Okay, you’ve made the prima facie case. You’ve basically shown us, you’ve convinced us that it makes sense to even entertain this idea,’ then before any debate, everyone implements it. Once people have actually implemented it, then the debate is about concrete, relevant things.” (Rubinius Developer)

Hence, inquiry unfolds through showing code that illustrates various ideas and solutions to problems, and thereafter developers can discuss which solution is the most suitable. Similarly, in the Rails project, debates with regards to how to achieve various goals are commonplace. A clear example is the implementation of ‘turbolinks’, a feature to speed up the transition between different parts of a web application developed using the Rails framework. During this implementation process, developers diverged markedly in their opinions, and inquiry into what constituted desirable ways of achieving such increases in speed was widespread:

“So my little teeny bit of like the way to get around getting in trouble was I wrote a blog post the day Rails came out and said like ‘If you’re complaining about Turbolinks, look at how easy it is to remove. All you need to do is do this,’ and then list out the three steps to actually take it out of your application and left it at that.” (Rails Developer)

Here, a developer expresses a dissenting opinion on his private blog, contributing to an overall discourse on how various features should be implemented in the Rails framework.

Such debates can often be acrimonious and resemble ‘drama’:

“Like there is quite a bit of, maybe ‘drama’ is not the right word, like back and forth, like ‘I want this to go in. Somebody else doesn’t want it to go in.’ You kind of do have to fight for your features, but at the end of the day, somebody has to make a call in like a feature is in or out, a pull request is either in or out.” (Rails Developer)


Such drama, or discourse, also unfolds across the actual enactments of development routines within each of the projects. For example, below a number of developers manage large disagreements with regards to how code should be fashioned:

“Lychton opened a pull request: Rails projects do not work if project path contains open bracket "["
Lychton commented: Rails projects do not function if they are created within a project path that contains an open bracket "["….
mattell commented: This is quite weird, though I don't see it being that effective to change. Why not just rename the directory?
Lychton commented: After spending about 12 hours on the most basic Rails deployment ('hello_world') and almost abandoning Rails altogether, and then filing a clear defect with detailed repro on both Linux and Mac OS X, I confess it's a bit disheartening that the response is to suggest that I rename my directory. Given the bug, that's definitely my only option at the moment.
…10 additional comments…
stardust was assigned the pull request
…12 additional comments…
stardust referenced this pull request
…3 additional comments…
the pull request was referenced 2x
…2 additional comments…
stardust referenced this pull request
johanberg closed the pull request
johanberg commented: I think that we're at a stalemate here…If someone can come up with a solution that satisfies everyone in this discussion, we can implement that, but I can't see a path forward here.”

In sum, the actual heuristics that will be applied when using a discursive rationality are determined by the outcome of various social and technical inquiry processes. Such inquiry processes are often necessary for establishing heuristics suitable for work on problems stemming from codebases with higher degrees of variation in file-interdependence structures, and therefore tend to increase routine heterogeneity.


Theorizing: Explaining Patterns of Routine Heterogeneity

As we can see in Table 31 below, the cognitive characteristics of the projects can be used to systematically explain the variation in heterogeneity across the projects. The fundamental logic is this: codebases (materiality) configured in certain ways provide problems which developers must build routine performances to solve. Specific forms of materiality (characterized by the number of walktrap clusters or communities which codebases contain) tend to provide problems of differing task complexity – thus pushing developers to utilize different forms of rationality so as to configure routine performances of different heterogeneity. Below I explore how these combinations of materiality and rationality lead to the unfolding of routine performances with differing degrees of heterogeneity.

Table 31. Routine Heterogeneity

Projects                  | Bootstrap, Django      | Rails, Rubinius
Material Characteristics  | Mono-faceted codebases | Multi-faceted codebases
Cognitive Characteristics | Principled Rationality | Discursive Rationality
Routine Heterogeneity     | Low                    | High

Principled rationality provides capacities for processing information through focusing on heuristics for evaluating the output of design processes vis-à-vis certain goals. Heuristics for evaluating outputs include measures such as centralized decision-making, which apply broadly to all code being written for a certain project. Broadly applicable output heuristics allow projects to easily source and evaluate contributions from a community, thereby providing the capacity for processing information.

Less complex material structures render a principled form of rationality tenable. Because such codebases lead to the emergence of simpler development problems, inquiry is required to a lesser extent, and principles can provide effective guidance to developers. This means that developers base their activities on the fundamental guiding principles or design values that the project espouses. Because a simpler architecture allows developers to deal with problems without having to consider excessive amounts of interdependencies, longstanding principles can more easily guide developers in making choices between various relatively simple solutions. Clarity with regards to which heuristics should be applied simplifies development processes and keeps routine heterogeneity relatively low.

However, when file-interdependence structures are more complex, a discursive approach to development is necessary for exploring the heuristics which are possible given a material configuration that is often varied and contains a wealth of qualitatively different functions and interdependencies between them. Through continuously inquiring into the different ways in which interdependent problems can be solved, developers apply a discursive approach to establishing the heuristics that will guide development. Because of the complex processes of dialogue, inquiry, consensus-building, and decision-making that discursive rationality entails, development processes become more complex, and therefore routine heterogeneity becomes relatively high.

In summary, forms of rationality within an OSS project drive routine heterogeneity as the project seeks to address needs emerging from codebases with different degrees of material complexity. Distinct configurations of strategies lead to development routines configured in different ways. Principled heuristics, made possible by less complex codebases, enable developers to quickly and easily apply heuristics (i.e. preferring to follow explicitly formulated procedures rather than inquiring into contextually preferable processes), thus enabling simpler, less heterogeneous routines. Discursively shaped heuristics require developers to inquire into social and technical concerns (e.g. through debate and discussion), thus increasing the overall routine heterogeneity required to deal with varied and sometimes ambiguous coding problems.

At this point one may ask: how are such rationalities shaped? The genesis of rationalities is based on processes of organizational imprinting (Johnson, 2007). For example, the Zen of Python has influenced the Django project since its inception, and the corporate beginnings of the Bootstrap project are a potential source of the tight control which its BDFLs exercise over the project. The Rubinius founder established a very liberal commit policy early on during the project, and dynamics within the Rails community illustrate practices derived from its founding, such as being an ‘omakase’ (curated) project as well as prioritizing convention over configuration (thus leading to debates with regards to what ‘convention’ actually is). Across all four projects we can see how the roots of a project, and the way in which it was originally founded, play a large role in shaping architectural decisions and cognitive heuristics.

As architectural decisions begin to be made, they are made alongside the formation of an emergent form of rationality applied within each project. The current state of the architecture provides a set of problems to be worked upon, and rationalities drive routine performances, which supply a set of solutions that are implemented in an evolving codebase. In my model, the material architecture is captured by the file-interdependence graphs that are created as a codebase is being worked upon. The degree of information processing needs, or task complexity, that a codebase offers up scales with its degree of file-interdependence. The more varied the file-interdependence structure, the more information processing is involved in the problems that are likely to arise from such a codebase.

Figure 18. Process Model

In response to such problems, developers start to shape different forms of rationality – basic attitudes towards structuring cognitive heuristics for how to solve the problems at hand. These rationalities differ in their form. They can be focused on either discursive or principled aspects, and their basis is drawn from either social inquiry or principles, which often are based on early imprinting processes still active within a project. These rationalities are active as basic cognitive attitudes which developers draw upon when enacting routine performances, which result in solutions to the problems provided by codebases. In general, discursive rationality engenders higher degrees of routine heterogeneity compared to principled rationality.

Hence, the continuous supply of problems based on the current architecture (characterized by its varying file-interdependence structures) and the generation of solutions driven by specific forms of rationality (discursive or principled) create a dynamic stream of routine performances exhibiting different degrees of heterogeneity (Figure 18).


Discussion

There is considerable variation in the structures of OSS projects, but because the extant literature on OSS has focused on delineating how OSS differs from other, more conventional, forms of software development, we lack an understanding of how development routines which vary across OSS projects are grounded in the actual practices of OSS developers. I have argued that routine heterogeneity represents information processing capacity (Page, 2010). Therefore, understanding how such heterogeneity is generated is key to understanding how OSS projects deal with complex development endeavors and avoid failure due to an inability to provide sufficient degrees of variety.

In this study I explain how variation in the heterogeneity of routines emerges as the result of various forms of rationality and heuristics that developers apply throughout the development process. Hence, in responding to needs with distinct levels of task complexity, heuristics guide developers in configuring routines with requisite levels of heterogeneity to successfully adapt to problems posed by differently configured codebases. My theoretical model thus helps us to move beyond structurally determinist accounts of OSS and show how the properties of structures themselves, at least as they unfold over time, are shaped by various rationalities.

Theoretical Contributions

I contribute to the literature on OSS by building upon Howison & Crowston (2014) to forge an account of the ways in which social and technical processes of OSS projects interact so as to shape the multiple development routines observable under the banner of OSS. I build upon their account of the processes and heuristics that are similar across OSS projects by showing how material conditions and strategies that differ across OSS projects are implicated in generating varying development routines. Further, routine heterogeneity is not simply ‘variation’ or ‘volatility’ around a standardized routine, as suggested by open superposition (Howison & Crowston, 2014). Rather, routine heterogeneity is a coping mechanism for dealing with complex technical problems – it essentially represents the capacity to process various forms of development-related information.

I explain how such capacities are generated by expanding the IPV to explicitly consider various procedural strategies – rationalities – for structuring information processing routines as they unfold over time. This renders the IPV more useful for application to fluid and ‘open’ forms of organizing, such as OSS, within which structuring mechanisms such as hierarchies are often difficult to apply successfully (Tuertscher, Garud, & Kumaraswamy, 2014), but where the dynamic unfolding of routines may be helpful in understanding how coordination is achieved. Hence, I provide conceptual tools for understanding a wider range of new organizing forms utilizing ‘open’ principles (Puranam, Alexy, & Reitzig, 2014). As organizations increasingly embrace virtual forms of collaboration exhibiting open principles, such forms of organizing will gradually become more important, and providing knowledge about how these seemingly uncoordinated organizations can be influenced is therefore important to scholars and practitioners alike.

The means through which routine heterogeneity is influenced, namely heuristics based on various forms of rationality, bear some similarities to the concept of ostensive routines. However, while ostensive routines imply totalizing standard operating procedures, often explicitly defined (Feldman & Pentland, 2003; Iannacci & Hatzaras, 2011), the concept of heuristics implies the ability to dynamically reconfigure and recombine guiding micro-level structures. I therefore also contribute to the literature on routines by showing how guidance of routine performances in ‘open’ contexts emerges through the use of heuristics.

Practical Contributions

One of the key problems of OSS governance is that there are few levers for influencing a larger community of developers to follow a desired path or direction. The absence of authority structures and clear methodologies limits the opportunities that OSS “project managers” have to structure the collective work of a community. I contribute to practice by showing how heuristics form the basis for shaping development routines with varying degrees of information processing capacity. Various configurations of heuristics can therefore be deployed to shape emergent development routines: patterns of software development activities that emerge from the ground up in OSS projects. By actively shaping and disseminating specific heuristics throughout a community, core developers can influence the capacity of a project to deal with complex problems.

Limitations & Future Research

This study has examined only a limited set of explanations for routine heterogeneity in the context of OSS projects; in addition to the mechanisms theorized herein, others likely also influence the process of generating heterogeneity. Additionally, other “open source-like” contexts may offer different sets of mechanisms. This suggests that inquiry into the generation of important software development routines needs to be taken beyond the OSS context and widened to include additional mechanisms as well as other relevant aspects of software development. For example, social network characteristics are utilized in many studies of OSS and of similar design contexts where activities emerge from local interactions. It would be valuable to connect specific degrees of routine heterogeneity to the specific social network structures created through the repeated performance of particular development routines.


Conclusion

This dissertation presents three studies that inquire into the ways in which OSS developers coordinate the task interdependencies involved in resolving an array of software development problems, both simple and complex. It does so by proposing that routines perform essential information processing functions that OSS developers draw upon. I asked three questions regarding the origin, evolution, and variation of such routine structures. These questions have been addressed by showing 1) how developers respond to simple and complex problems by forming multiple types of routines that together constitute information processing systems, 2) how such routines vary across time in response to the inherent disagreements in the community, and 3) how routine structures vary across projects, driven by rationalities and associated heuristics that respond to differing material conditions. These contributions have been discussed at length in each study and are summarized in Table 32 below.

Considering these three studies as a joint product, I can also extract a number of broader contributions to the study of coordination and information processing. These are theoretical, methodological, and practical in nature, and they set the stage for a research stream on routine structuring within open innovation enabled by digital platforms. Theoretically, I contribute to our understanding of how OSS developers learn to tackle complex problems by theorizing the origin, evolution, and variation of various routine structures. I also highlight distinct discursive characteristics of the routines that developers utilize.


Methodologically, I show how computational analyses and qualitative inquiry can be mixed by utilizing their shared inductive assumptions. This demonstrates one way to use “big data” within an essentially qualitative approach. Practically, the results point towards the need for increased efforts to develop systems that not only support version control and the ‘remixing’ of material artifacts (Nickerson, 2014), but also support the discursive processes through which developers tackle complex and ambiguous problems. In the end, this work sets the stage for a broader research stream, which promises to increase our understanding of the processes of open innovation.

Table 32. Contributions

Contributions from individual studies:
• Study #1 (Origin): Multiple forms of routines respond to both simple and complex problems
• Study #2 (Evolution): Routines adapt over time to disagreements across the release cycle
• Study #3 (Variation): Routines, guided by varying rationalities, vary across projects as responses to material conditions

Meta-contributions:
• Theory: Emergent methodologies-in-action coordinate task interdependencies through providing routine structures; explaining the role of routines in generating information processing capacities; discourse is an essential mechanism shaping routine structures; extending the IPV with a discursive layer
• Method: Showing a particular approach to computational-qualitative mixed methods; showing how grounded theory can be applied to digital trace data
• Practice: The importance of play to facilitate collaboration; facilitating discourse to engender coordination

Theoretical Contributions

This dissertation makes multiple theoretical contributions to the literatures on OSS, information processing, routines, and coordination. Perhaps the most important contribution concerns explicating the formation of what I call emergent methodologies-in-action. By this I mean the particular ways in which software development activities are configured and structured over time (Humphrey, 1989), not by explicit formal mechanisms (Zmud, 1980), but rather by ongoing interactions between distributed developers who work on heterogeneous sets of development problems. The nature of the problems that a specific OSS project confronts (generated by the environment, instantiated as requirements embedded throughout the community, and expressed in the degrees of task complexity that problems entail) creates a need for a family of distributed responses: developers react to the problems that are relevant to their local environment. As they do so, their performances crystallize into distinct routine structures, responses to the particular environments that they face.

The patterns that emerge from these interactions establish emergent methodologies-in-action for OSS and help explain how an ‘open’ context such as OSS can be coordinated (O’Mahony & Ferraro, 2007): through the integration of multiple routine clusters that address both simple and complex problems. The distribution of such routine clusters shifts adaptively across time (as a project moves through the release cycle) as well as with contextual factors (such as the degree of technical complexity that a project has to cope with). Hence, an emergent methodology-in-action is a set of latent patterns of routine enactments, borne out as adaptive responses to the stream of requirements that face an OSS project at a particular time, in a particular context.

As such, this dissertation takes issue with the concept of ‘emergence’ (Schelling, 1978). Emergence is most often used throughout the IS and organization literatures in a hand-waving fashion, to indicate that the process through which a phenomenon comes into being is non-linear and generally difficult to describe and explain. In this dissertation I show how emergence works in the case of OSS ‘methodologies’: how different routine clusters emerge as responses to design problems, and how they are shaped so as to form a comprehensive system of information processing. Further, these routine systems react to various environmental factors. In responding to such factors, varying rationalities provide macro-level ‘control’ over the unified routine structure. These rationalities often engender the specific shape and degree of heterogeneity that routine structures take.

These emergent methodologies-in-action provide coordination mechanisms by attending to varying information processing needs and related capacities (Galbraith, 1973, 1974). The capacities are provided by the emergence of differentiated routine structures that serve different information processing functions, such as conducting triage, transferring information across routines, and, most importantly, various problem-solving approaches that address simple and complex problems either directly or through discursive means. As such, these routines enable OSS developers to conduct the work that they collectively have set out to do: constructing complex, shared code artifacts. The coordination mechanisms are provided by linking routine structures together into a holistic information processing system that both senses problems (receiving, evaluating, redirecting, and/or rejecting them) and addresses them. Because less heterogeneous routine structures are linked to more heterogeneous ones, problems exhibiting higher degrees of complexity can be simplified to the point where they become amenable to a solution.

A central concept of this dissertation is routine variety, understood in terms of entropy and heterogeneity. While entropy relates to the diversity of activity types and the attendant capability of a routine enactment to deal with uncertainty, heterogeneity relates to the recursive ordering of activities and the related capability to deal with ambiguity. Together, these two forms of variety capture the contours of routine enactments that process different informational aspects of the coding problems developers face.
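To make these two measures concrete, the minimal sketch below computes both for a handful of hypothetical pull-request action sequences, using the same TraMineR calls applied in Appendix D: seqient() for within-sequence entropy, and seqsubsn() for the distinct-subsequence count, which serves here as a proxy for the recursive ordering of activities. The sequence data are purely illustrative, not drawn from the studies.

library(TraMineR)

# Hypothetical pull-request action sequences (illustrative only)
seqs <- rbind(c("opened", "discussed", "closed", NA),
              c("opened", "discussed", "reviewed", "merged"),
              c("opened", "closed", NA, NA))
sequences.sts <- seqdef(seqs, right = "DEL")

# Entropy: diversity of activity types within each enactment (uncertainty)
seqient(sequences.sts, norm = TRUE)

# Distinct subsequences: recursive ordering of activities (ambiguity)
seqsubsn(sequences.sts, DSS = FALSE)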

Beyond parsing out the functions of entropy and heterogeneity, the three studies taken together also teach us something about the role of variety at different levels of analysis. Variety at the routine level helps us discern different types of routines, as captured by studies #1 and #2; here variety largely represents the capacity to deal with more or less complex problems. When measured at the project level (as in study #3), however, routine variety helps us understand the variety of routine types rather than variety within routines themselves. Here, more varied routine structures across an entire project are related to more varied material structures, and therefore to broader ranges of technical problems confronting developers.

Further, a distinct component in each of the studies in this dissertation is the role of discourse. In addition to the ‘superposition’ dynamics of coordination and collaboration (Howison & Crowston, 2014), developers engage in ‘deep’ discourse regarding appropriate ways of solving problems, task interdependencies, technical interdependencies, development priorities, and the governance of feature sets and architectural decisions. Such discursive processes are key in shaping routine structures so that they perform various information processing functions. Discourse therefore forms an important enabling mechanism driving emergent methodologies-in-action; without it, common principles, values, approaches, and decisions seem unlikely to emerge.


The characterization of discursive processes throughout this dissertation is therefore a key contribution to understanding how open development contexts are coordinated. Whereas modularization and superposition minimize interdependencies to allow for concurrent work, discourse synchronizes efforts where task (and technical) interdependencies cannot be escaped. As such, discourse allows us to understand how specific forms of work (e.g. specific routine structures) emerge as responses to certain conditions (e.g. the status of codebases), guided by rationalities (Simon, 1978) and associated heuristics (Hutchins, 1995).

Last, the introduction of a discursive element provides a distinct extension of the IPV. Traditionally, cognitive and information-based views of organizations tend to simplify accounts of human interaction to the point where discourse, culture, meaning, and interpretation are excluded (Bruner, 1990). More recent views of cognition, such as distributed cognition, indicate that cognitive acts are culturally and contextually situated (Hutchins, 1995) and therefore cannot be understood as straightforward processing of unambiguous signals.

The studies herein clearly show that the IPV can be extended to include discursive elements. What we see across all three studies is that discursive problem-solving is a collectively enacted practice that generates information processing capacities for solving problems characterized by both high uncertainty and high ambiguity: essentially, highly complex problems. Because such problems rarely present themselves as clear, unambiguous, discrete sets of information, they can often only be solved through practices that can accommodate great amounts of information asymmetry and ambiguity, as well as the partial social construction of problems and desired solutions.

Hence, we can add a discursive layer to the more traditional technocratic information processing mechanisms previously specified by the IPV. The facilitation and practice of discourse then forms a collectively enacted mechanism for leveraging distributed competencies and knowledge resources to tackle complex problems. In this sense it bears some superficial similarity to the creation of lateral relations, one of Galbraith’s (1974) original strategies for increasing information processing capabilities. However, while lateral relations are ostensively specified relational structures, discourse is a dynamic practice that cannot be ostensively specified but rather is enacted by groups of developers with shared concerns and distributed resources. This extension makes the IPV more useful in contexts where formal organization and governance are largely absent or difficult to implement, such as OSS and other “open source-like” contexts.

Methodological Contributions

This dissertation combines qualitative inquiry and computational analysis in novel ways. In doing so, it follows a growing trend towards mixed methods research (Onwuegbuzie & Collins, 2007) that incorporates computational analyses into qualitative inquiry. Previous instances of this approach have integrated qualitative inquiry with text mining (Tuertscher et al., 2014) or social network analysis (Leonardi, 2007). This dissertation extends such work by explicitly integrating temporally oriented analysis methods, such as sequence analysis, with qualitative inquiry. It does so by utilizing computational techniques and qualitative methods that share a common, inductive approach to generating insights (Holland et al., 1989). This intimate integration of methods expands our capacity to generate longitudinal and dynamic accounts of how complex work processes are organized and unfold across time.

I apply qualitative and computational analysis to the same set of data: digital trace data represented both as text and as numbers. In the case of routines, this can be done by analyzing action sequences using sequence analysis and clustering techniques, while at the same time treating the discussions, code, and comments recorded as digital traces during routine performances as text, which can be read, coded, and interpreted in the same way as any other qualitative material (Glaser & Strauss, 1967). This lays some of the groundwork for what may be termed a grounded theory approach to big data (I use the term “big data” here to indicate data that is too large for the tools we have traditionally used to understand such data, cf. Berente & Seidel, 2014).

Building on previous accounts of retroduction (Zachariadis et al., 2013), which show how qualitative inquiry can be used to explain quantitative patterns, this dissertation provides insights into how digital archival data can serve as a rich source of qualitative insight. In most previous qualitative work, interviews serve as the main source of insight. While this dissertation builds on insights from 46 in-depth qualitative interviews, its main source of insight is digital archival data: both the thousands of conversations developers conducted around various code contributions (as text) and the higher-level conversations reflected in blog posts and video-recorded conference presentations (as context). These sources provided a vast and rich body of qualitative material, which enabled me to tightly integrate qualitative inquiry and computational analysis based on the dual nature of the data (as both text and numbers) as well as the shared inductive assumptions of the methods applied (Holland et al., 1989). This allowed me to sort through large amounts of data using computational techniques (such as cluster analysis for identifying groups of similarly structured routine performances), which then served as a theoretical sampling criterion for rigorous qualitative coding, yielding rich insights that thin computational analyses cannot generate.
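The sketch below illustrates this computational-to-qualitative handoff, reusing the optimal matching and Ward clustering steps from Appendix D. The toy sequences, the cluster count k = 2, and the per-cluster sample size are illustrative assumptions; in the studies, the recorded discussions of the sampled performances were then read and coded qualitatively.

library(TraMineR)
library(cluster)

# Hypothetical routine performances (illustrative only)
seqs <- rbind(c("opened", "discussed", "reviewed", "merged"),
              c("opened", "closed", NA, NA),
              c("opened", "discussed", "discussed", "closed"),
              c("opened", "reviewed", "merged", NA))
sequences.sts <- seqdef(seqs, right = "DEL")

# Optimal matching distances with a constant substitution cost (as in Appendix D)
ccost <- seqsubm(sequences.sts, method = "CONSTANT", cval = 2)
om <- seqdist(sequences.sts, method = "OM", sm = ccost)

# Ward clustering groups similarly structured performances
clusters <- cutree(as.hclust(agnes(om, diss = TRUE, method = "ward")), k = 2)

# Theoretical sampling: draw a few performances per cluster for close reading
set.seed(1)
sample_ids <- unlist(lapply(split(seq_along(clusters), clusters),
                            function(ids) ids[sample.int(length(ids), min(2, length(ids)))]))
sample_ids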

Practical Contributions

The results of the three studies point towards the importance of encouraging developers to explore and play around with distinct problems. This builds on insights from the literatures on play (e.g. Nachmanovitch, 2009) and scientific discovery (Alon, 2009), which emphasize the importance of attending to the messy, discursive, and collaborative nature of problem-solving processes, even in a context such as OSS where much work is conducted in a seemingly individualized and isolated manner. Such problem-solving processes essentially consist of bricolage (Garud & Karnøe, 2003; Weick, 2004), by which available problems, solutions, tools, and materials are collated and configured in ways that ultimately match appropriate solutions to problems (Cohen, March, & Olsen, 1972).

This is not to suggest that OSS managers and developers should behave in erratic and random ways, but rather that they should build a sensitivity towards exploration (March, 1991): what my informants repeatedly referred to as “playing around” with a specific problem or piece of technology. The insights from this dissertation, as well as from the broader literature on emergence and coordination (e.g. Tuertscher, Garud, & Kumaraswamy, 2014), suggest that if we allow participants to mull over and engage with problems they find important, then structure does in fact arise out of the messiness of local, decentralized activities performed by autonomous agents. Even a construct such as routines, traditionally imagined as having clear ostensive and performative aspects, can here be shown to be in constant emergence from the performance of sequences of developer actions.

The information processing mechanisms uncovered in this dissertation (routines for coping with complexity, as well as the rationalities that shape such routines) draw their coordinating power from the ways in which they string actions, and then routines, together into systems of information processing. Utilizing these insights, practitioners can render tacit knowledge more explicit so as to successfully adapt their projects to environmental stress. This may help developers become more conscious of, and attentive to, the emergent methodologies forming within their projects.

A further important implication is that while systems supporting version control and the ‘remixing’ of code (Nickerson, 2014) have made great strides in the last few years (Sourceforge, Bitbucket, and Github are three cases in point), more work remains to support the discursive processes that developers use to tackle problems that are either technically complex (e.g. interdependent code) or socially complex (e.g. ambiguous and/or conflicting goal sets, cf. Cataldo, Herbsleb, & Carley, 2008). The importance of discursive processes to open modes of software development uncovered in this dissertation points towards the need to fashion both social and material technologies to support and direct such processes.


Limitations & Future Research

Due to the multi-method approach followed here, external validity can be expected to be higher than under a single method. The results can be expected to describe salient features of medium-sized OSS projects. While digital trace data is naturally limited to whatever behaviors have been recorded, such gaps can be filled through qualitative inquiry. However, for these studies to gain traction within the wider IS and organization theory communities, their relevance to a larger set of organizing phenomena must be demonstrated. Here, I latch onto a recent intellectual movement suggesting that OSS is a forerunner of more open, collaborative, and dynamic forms of organizing (Puranam et al., 2014), potentially applicable to endeavors beyond software. Examples include wiki articles (Kane et al., 2014), music remixing (Jarvenpaa & Lang, 2011), collaborative team science (Falk-Krzesinski et al., 2011), product development (Franke & Shah, 2003), and more exotic cases such as the emergence of the surfboard sport and industry (von Hippel, 1994). The relevance of this dissertation to such forms of organizing, only distally related to OSS development, will rest on identifying structural features that can serve as plausible conjectures for how organizing unfolds in these other domains.

Hence, the ambition of this dissertation is to launch a longer and wider research program into how routines enable coordination in innovation contexts where effective coordination seems surprising or challenging. With the rise of open innovation (von Hippel, 1994), web 2.0, and other forms of distributed innovation communities (Jarvenpaa & Lang, 2011), as well as massive science and engineering projects (Tuertscher et al., 2014), such questions and issues will only become more pressing.


While the dynamics of coordination exposed by examining the OSS context may not fully generalize to other contexts, OSS has often been considered a helpful blueprint or guidepost for understanding coordination in similar contexts (Puranam et al., 2014).

Since routines, both emergent and formally specified, are ubiquitous in organizational contexts (Feldman & Pentland, 2003), at least part of what we can learn from the OSS context is likely to be applicable elsewhere. This dissertation therefore provides a steppingstone for a wider inquiry into emergent routines as a source of coordination in contexts where “open organizing” is a central governance theme (Lindberg, Gaskin, Berente, Lyytinen, & Yoo, 2013). Such inquiry will involve identifying both common principles of coordination that hold across contexts and conditions, and the specific moderating conditions under which different forms of routines serve as effective coordination mechanisms. To the extent that conditions generalize across contexts, cross-fertilization of theoretical understandings is possible.

While previous accounts of coordination have generally taken a modularization, or ‘superposition’, approach (Howison & Crowston, 2014), the use of routine structures also allowed me to inquire into how discursive patterns form over time and how such patterns become sources of guidance for developers. Through discourse, developers collaborate around complex problems, but they also leave rich traces in the institutional memory of a project, which help guide future developers as to what development patterns are deemed appropriate (e.g. Latoza, Garlan, Herbsleb, & Myers, 2007).

Since this institutional memory is supported by the recording of digital traces that developers can freely access, there is continuity across past and future discursive processes that supports the coordination of complex task interdependencies. The patterns identified in this dissertation are foundational, yet rudimentary. Additional inquiry into the role of discourse as a coordinative mode in open innovation contexts may yield more contextually situated insight into various forms of discourse and their relationships to coordinating diverse forms of tasks.

Further, there are opportunities for integrating social network analysis with various forms of procedurally oriented inquiry into social structure. While the scope of this dissertation has not allowed for a full integration of these theoretical perspectives and their related computational methods, it opens avenues for further research. Important questions here concern how certain relational structures (e.g. community structures) are generated by certain routines, as well as which characteristics of relational structures allow different forms of routines to unfold.

Last, as new data and analysis techniques become available to researchers at an overwhelming rate, qualitative researchers need to adapt to ensure that their approaches remain relevant in a new, data-intensive age. So far, quantitative researchers have enthusiastically grasped opportunities related to big data and computational social science (often referred to in industry as data science). Qualitative researchers, I argue, have largely resisted these trends. Rather than responding adaptively, they have largely defended their traditional approach, relying mainly on interviews as a source of qualitative insight generated through manual coding, as a contrast to big data approaches. A more appropriate response would be to fashion new qualitative approaches suited to the new forms of data becoming available, and to use sophisticated machine learning algorithms to help sort through large volumes of data.

Here, there is a large contribution to be made. Since both machine learning and grounded theory are essentially inductive approaches to identifying latent patterns in data, they have much in common. The three studies in this dissertation thus stand as examples of how computational analyses and qualitative inquiry can be combined in an essentially inductive approach to generating new theory. In the coming years, formalizing and theorizing the specific processes by which such inductive, mixed methods work will prove crucial to qualitative and mixed-methods scholars alike. Researchers and research groups able to combine skills in computational techniques with the sensibility required for sophisticated qualitative research stand well prepared to arrive at rich intellectual insights.


Appendix A: Data Extraction R/SQL Query

This query was originally created here: http://www.ghtorrent.org/relational.html

# Identify all forks named rails:
# SELECT * FROM projects WHERE name = "rails"
# They all seemed to be forked from 1334, so I get that:
# SELECT * FROM projects WHERE id = 1334
# Then I get all the pull request IDs from the base_repo
# (where the PRs are pulled to, i.e. rails/rails):

# Loading the database
# export PATH=$PATH:/usr/local/mysql/bin
# wget http://ghtorrent.org/downloads/msr14-mysql.gz
# mysql -u root -p
# mysql> create user 'msr14'@'localhost' identified by 'msr14';
# mysql> create database msr14;
# mysql> GRANT ALL PRIVILEGES ON msr14.* to msr14@'localhost';
# mysql> flush privileges;
# Exit MySQL prompt
# zcat msr14-mysql.gz | mysql -u msr14 -p msr14
# mysql -u msr14 -p msr14
# system(mysql -h ghtorrent -u aron@localhost -p ghtorrent < msr14-mysql.sql, wait = TRUE)

library(DBI)
library(RMySQL)

m <- dbDriver("MySQL")
con <- dbConnect(m, user = 'msr14', password = 'msr14',
                 host = 'localhost', dbname = 'msr14')

all_rails_projects <- dbGetQuery(con, 'SELECT * FROM projects WHERE name = "rails";')
all_rails_prs <- dbGetQuery(con, 'SELECT id, pullreq_id FROM pull_requests WHERE base_repo_id = 78852;')

# id_list <- fetch(res, n = -1)
# Initialize the output frame for the event data collected below
df <- c('pull_req_id', 'user', 'action', 'created_at')
out <- numeric(length(df))
names(out) <- df

# For each pull request, gather all of its events (history, issue events,
# issue comments, and review comments) ordered by timestamp
for (i in 1:length(all_rails_prs$id)) {
  SQL <- paste("select user, action, created_at from (
      select prh.action as action, prh.created_at as created_at, u.login as user
      from pull_request_history prh, users u
      where prh.pull_request_id = '", all_rails_prs$id[i], "'
        and prh.actor_id = u.id
      union
      select ie.action as action, ie.created_at as created_at, u.login as user
      from issues i, issue_events ie, users u
      where ie.issue_id = i.id
        and i.pull_request_id = '", all_rails_prs$id[i], "'
        and ie.actor_id = u.id
      union
      select 'discussed' as action, ic.created_at as created_at, u.login as user
      from issues i, issue_comments ic, users u
      where ic.issue_id = i.id
        and u.id = ic.user_id
        and i.pull_request_id = '", all_rails_prs$id[i], "'
      union
      select 'reviewed' as action, prc.created_at as created_at, u.login as user
      from pull_request_comments prc, users u
      where prc.user_id = u.id
        and prc.pull_request_id = '", all_rails_prs$id[i], "'
    ) as actions order by created_at;", sep = "")
  res <- dbGetQuery(con, SQL)

  # If the query returned rows, prepend the pull request id
  # and rbind them to the out data frame
  if (!is.null(res)) {
    out <- rbind(out, cbind(rep(all_rails_prs$pullreq_id[i], nrow(res)), res))
  }
}


Appendix B: Data Extraction Ruby Script

require 'octokit'
require 'csv'

client = Octokit::Client.new :login => 'my_username', :password => 'my_password'
repo = 'django/django'
numbers = CSV.read('/Users/Aron/Dropbox/Dissertation/3-Variance/Journal/Computational Analysis/django_c1.csv').flatten

CSV.open('django_c1_stats.csv', 'w') do |csv|
  csv << ["pull.number", "pull.additions", "pull.deletions", "pull.created_at",
          "pull.merged_at", "pull.commits", "pull.changed_files",
          "names_of_files_changed"]
  for number in numbers
    begin
      pull = client.pull_request(repo, number)
      files = client.pull_files(repo, number)
      # pull_files returns an array of file resources; collect their names
      csv << [pull.number, pull.additions, pull.deletions, pull.created_at,
              pull.merged_at, pull.commits, pull.changed_files,
              files.map(&:filename).join('|')]
    rescue
      # On any API error, write a zeroed row so the pull request id is preserved
      csv << [number, 0, 0, 0, 0, 0, 0, 0]
      next
    end
  end
end


Appendix C: Data Extraction R Script

setwd("/Users/Aron/Dropbox/Thesis/3-Variance/Journal/Computational Analysis/compute/") library(httpuv) library(jsonlite) library(dplyr) library(plyr) library(stringr) library(igraph)

# 0. Set up the query ctx = interactive.login("client_id", "client_secret")

# This function makes sure I get the pagination right digest_header_links <- function(x) { y <- x$headers$link if(is.null(y)) { # message("No links found in header.") m <- matrix(0, ncol = 3, nrow = 4) links <- as.data.frame(m) names(links) <- c("rel", "per_page", "page") return(links) } y %>% str_split(", ") %>% unlist %>% # split into e.g. next, last, first, prev str_split_fixed("; ", 2) %>% # separate URL from the relation plyr::alply(2) %>% # workaround: make into a list as.data.frame() %>% # convert to data.frame, no factors! setNames(c("URL", "rel")) %>% # sane names dplyr::mutate_(rel = ~ str_match(rel, "next|last|first|prev"), per_page = ~ str_match(URL, "per_page=([0-9]+)") %>% `[`( , 2) %>% as.integer, page = ~ str_match(URL, "&page=([0-9]+)") %>% `[`( , 2) %>% as.integer, URL = ~ str_replace_all(URL, "<|>", "")) } modularization_query <- function(owner, repo){ # This function pulls down data on all the pull requests. pull <- function(i){ commits <- get.pull.request.commits(owner = owner, repo = repo, id = i, ctx = get.github.context(), per_page=100) links <- digest_header_links(commits) number_of_pages <- links[2,]$page if (number_of_pages != 0) try_default(for (n in 1:number_of_pages){ if (as.integer(commits$headers$`x-ratelimit-remaining`) < 5) Sys.sleep(as.integer(commits$headers$`x-ratelimit-reset`)- as.POSIXct(Sys.time()) %>% as.integer())

213

else get.pull.request.commits(owner = owner, repo = repo, id = i, ctx = get.github.context(), per_page=100, page = n) }, default = NULL) else return(commits) } list <- read.csv(paste0("/Users/Aron/dropbox/Thesis/3- Variance/Journal/Computational Analysis/compute/", repo, "_include.csv"), header = FALSE) pull_lists <- lapply(list$V1, pull)

# This is a function for getting all the correct SHAs (ignores parent and tree SHAs) sha_list <- vector("list", length(pull_lists)) for (i in 1:length(pull_lists)){ try_default(sha_list[[i]]<- pull_lists[[i]]$content[[1]]$sha, default = NULL) # possibly I need to insert a try_default here }

# this removes all the NULL values # sha_list_clean <- sha_list[ ! sapply(sha_list, is.null) ] get_commits <- function(sha){ get.commit(git = NULL, ctx = get.github.context(), owner = owner, repo = repo, sha = sha) } commit_lists0 <- lapply(sha_list, get_commits) file_list <- vector("list", length(commit_lists0)) for (i in 1:length(file_list)){ try_default(file_list[[i]]<- commit_lists0[[i]]$content$files, default = NULL) }

# Then find all the filenames using grepl grep_filenames <- function(input){ unlist(input, use.names=FALSE )[ grepl( "filename", names(unlist(input)))] } filename_lists <- lapply(file_list, grep_filenames) filename_lists <- filename_lists[!grepl("test",filename_lists)]

# 1. Iterate across the list of PR_ids & create combination edgelists combine_edge_lists <- function(filename_lists){ try_default(t(combn(filename_lists, 2)), default = NULL) } file_lists_merged <- lapply(filename_lists, combine_edge_lists)

214

# 2. Merge all combination edgelists edge_list_final <- do.call(rbind, file_lists_merged)

# 3. Calculate sna_metrics for each node g <- graph.edgelist(edge_list_final) (g) } django_graph18 <- modularization_query("django", "django") rubinius_graph18 <- modularization_query("rubinius", "rubinius") bootstrap_graph18 <- modularization_query("twbs", "bootstrap") # I may need to redo this as I may have hit the rate limit rails_graph18 <- modularization_query("rails", "rails")

# Post Processing degree_list <- degree(g) names <- names(degree_list) names(degree_list) <- NULL output <- as.data.frame(cbind(names, degree_list), stringsAsFactors = FALSE) output$degree_list <- degree_list

# 4. Calculate average sna_metric for each PR mean_f <- function(entry){ mean(output$degree_list[output$names %in% entry]) } final_output <- sapply(file_lists_merged, mean_f) names(final_output) <- as.character(list$V1) final_output[is.nan(final_output)] <- 0 hist(final_output) write.csv(final_output, file = paste0(repo, "_modularization.csv")) (final_output) hist(degree(rails), breaks = 50, ylim = c(0, 500), xlim = c(0, 750)) hist(degree(rubinius), breaks = 50, ylim = c(0, 500), xlim = c(0, 750)) hist(degree(django), breaks = 50, ylim = c(0, 500), xlim = c(0, 750)) hist(degree(bootstrap), breaks = 50, ylim = c(0, 500), xlim = c(0, 750))


Appendix D: Data Processing R Script

setwd("/Users/Aron/dropbox/Dissertation/1-Measurement/Analysis/") library(TraMineR) library(cluster) library(stringr) # for the find/replace function in read_seqdata_notime library(network) library(sna) library(sqldf)

# Function for reading sequence data WITH timestamps read_seqdata <- function(data, startdate, stopdate){ data <- read.table(data, sep = ",", header = TRUE) data <- subset(data, select = c("pull_req_id", "action", "created_at")) colnames(data) <- c("id", "event", "time") data <- sqldf(paste0("SELECT * FROM data WHERE strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') >= '",startdate,"' AND strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') <= '",stopdate,"'")) data$end <- data$time data <- data[with(data, order(time)), ] data$time <- match(data$time, unique(data$time)) data$end <- match(data$end, unique(data$end)) (data) }

# Function for reading sequence data withOUT timestamps # Also replaces "synchronized" and "subscribed" with "NA" read_seqdata_notime <- function(data, startdate, stopdate){ data <- read.table(data, sep = ",", header = TRUE) data <- subset(data, select = c("pull_req_id", "action", "created_at")) data <- subset(data, action!="synchronize") data <- subset(data, action!="subscribed") data <- subset(data, action!="unsubscribed") colnames(data) <- c("id", "event", "time") data <- sqldf(paste0("SELECT * FROM data WHERE strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') >= '",startdate,"' AND strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') <= '",stopdate,"'")) data.split <- split(data$event, data$id) list.to.df <- function(arg.list) { max.len <- max(sapply(arg.list, length)) arg.list <- lapply(arg.list, `length<-`, max.len) as.data.frame(arg.list) } data <- list.to.df(data.split) data <- t(data) (data) }

# Function for opening network data read_netdata <- function(data){

216

data <- read.table(data, sep = ",") data <- as.network(data[, 1:2]) (data) }

# Function for counting the number of events event_count <- function(data){ sequences.sts <- seqdef(data, left = "DEL", gaps = "DEL", right = "DEL") event.count <- seqstatf(sequences.sts) (sum(event.count$Freq)) }

# Function for storing sequence length in a variable sequence_length <- function(data){ sequences.sts <- seqdef(data, left = "DEL", gaps = "DEL", right = "DEL") sequences.length <- seqlength(sequences.sts) (sequences.length) }

# Function for calculating entropies entropies <- function(data){ sequences.sts <- seqdef(data, left = "DEL", gaps = "DEL", right = "DEL") sequences.ent <- seqient(sequences.sts, norm = FALSE) # This stores the entropies (sequences.ent) }

# Function for calculating subsequences subsequences <- function(data){ sequences.sts <- seqdef(data, left = "DEL", gaps = "DEL", right = "DEL") sub.sequences <- seqsubsn(sequences.sts, DSS = FALSE) (sub.sequences) }

# Function for generating a dissimilarity value dissimilarity <- function(data, label){ sequences.sts <- seqdef(data, left = "DEL", gaps = "DEL") ccost <- seqsubm(sequences.sts, method = "CONSTANT", cval = 2, with.missing=TRUE) sequences.OM <- seqdist(sequences.sts, method = "OM", norm = FALSE, sm = ccost, with.missing=TRUE) clusterward <- agnes(sequences.OM, diss = TRUE, method = "ward") plot(clusterward, which.plots = 2) (sequences.OM) }

# Function for least-squares normalization of subseq by length normalization <- function(order, length){ data <- as.data.frame(cbind(order, length)) model <- lm(order ~ length, data) order_normalized <- order- (model$coefficients[1]+(model$coefficients[2]*length))

217

(order_normalized) }

# Function for normalizing subsequences by max_subseq_possible normalization2 <- function(order, length){ order_normalized <- order/2^max(length) (order_normalized*1000000) }

# Creating the dataset django_sequences <- read_seqdata_notime("/Users/Aron/github/local/github- activities/data/activity-django-django.txt", '2012-01-01', '2012-06-30')

# Random sampling for the django dataset set.seed(2) ids <- sample(unique(django_sequences$id), 100) # important: the UNIQUE id numbers django_sequences <- django_sequences[django_sequences$id %in% ids, ]

# Analysis # Number of events event_count(django_sequences)

# Lengths django_length <- sequence_length(django_sequences) mean(django_length) apply(django_length, 2, sd) # Standardize by variance django_length_st <- django_length/apply(django_length, 2, sd) mean(django_length_st)

# Entropies django_entropy <- entropies(django_sequences) mean(django_entropy) apply(django_entropy, 2, sd) # Standardize by variance django_entropy_st <- django_entropy/apply(django_entropy, 2, sd) mean(django_entropy_st)

# Subsequences django_subseq <- subsequences(django_sequences) mean(django_subseq) apply(django_subseq, 2, sd) # standardization by length^2 django_subseq_normalized <- normalization2(django_subseq, django_length) mean(django_subseq_normalized) apply(django_subseq_normalized, 2, sd)

# standardize by max(subseq) django_subseq_stmax <- django_subseq/max(django_subseq) django_subseq_norm <- django_subseq/django_length

218

django_subseq_st <- django_subseq/apply(django_subseq, 2, sd) mean(django_subseq_st) django_subseq_logged <- log(django_subseq) mean(django_subseq_logged) apply(django_subseq_logged, 2, sd) # normalize by length django_subseq_logged_norm <- django_subseq_logged/django_length mean(django_subseq_logged_norm) apply(django_subseq_logged_norm, 2, sd) # standardize by variance django_subseq_logged_st <- django_subseq_logged/apply(django_subseq_logged, 2, sd) mean(django_subseq_logged_st)

# Dissimilarities django_diss <- dissimilarity(django_sequences) mean(django_diss) django_diss_mean <- apply(django_diss, 1, mean) # squish into average distance per sequence django_vector <- as.vector(django_diss) sd(django_vector) django_diss_st <- django_diss/sd(django_vector) mean(django_diss_st)

# Create correlation matrices django_correlation_matrix <- cbind(django_entropy, django_subseq_normalized, django_diss_mean) cor(django_correlation_matrix)

# Load and crunch networks read_seqdata_for_network <- function(data, startdate, stopdate){ data <- read.table(data, sep = ",", header = TRUE) data <- subset(data, select = c("pull_req_id", "user", "action", "created_at")) data <- subset(data, action!="synchronize") data <- subset(data, action!="subscribed") data <- subset(data, action!="unsubscribed") colnames(data) <- c("id", "actor", "event", "time") data <- sqldf(paste0("SELECT * FROM data WHERE strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') >= '",startdate,"' AND strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') <= '",stopdate,"'")) data$end <- data$time data <- data[with(data, order(time)), ] data$time <- match( data$time , unique( data$time ) ) data$end <- match( data$end , unique( data$end ) ) slmax <- max(data$time) (data) }

# Loading and analyzing the network datasets

219

django_network <- read_seqdata_for_network("/Users/Aron/github/local/github- activities/data/activity-django-django.txt", '2012-01-01', '2012-06-30') dat <- django_network # This is where you control which network you are analyzing library(plyr) dat2 <- ddply(dat, .(id), function(d){ data.frame( event = d$event[-1], from = d$actor[-NROW(d)], to = d$actor[-1], time = paste(d$time[-NROW(d)], d$time[-1], sep = "-") ) }) dat3 <- cbind(dat2["from"], dat2["to"]) write.table(dat3, "network.txt", sep=",", row.names = FALSE) read_netdata <- function(data){ data <- read.table(data, sep = ",") data <- as.network(data[, 1:2]) (data) } data2 <- read_netdata("network.txt")

(data2) network.edgecount(data2) # number of edges network.size(data2) # number of nodes lubness(data2) connectedness(data2) efficiency(data2) hierarchy(data2) plot(data2)

# Function for finding most frequent subsequences find_subseq <- function(data, limit, support){ sequences.seqe <- seqecreate(data) library(TraMineRextras) sequences.seqe <- seqefsub(sequences.seqe, pMinSupport = support) sequences.fsubseq <- seqentrans(sequences.seqe) sequences.fsb <- sequences.fsubseq[sequences.fsubseq$data$nevent > limit] sequences.fsb }

# Function for finding representative sequences find_repseq <- function(data){ slmax <- max(data$time) sequences.seqe <- seqecreate(data) sequences.seqe <- seqformat(data, from="SPELL", to="STS", begin="time", end="end", id="id", status="event", limit=slmax)

220

sequences.sts <- seqdef(sequences.seqe, left = "DEL", right = "DEL", gaps = "DEL") ccost <- seqsubm(sequences.sts, method = "CONSTANT", cval = 2, with.missing=TRUE) sequences.OM <- seqdist(sequences.sts, method = "OM", sm = ccost, with.missing=TRUE) seqrep(sequences.sts, dist.matrix = sequences.OM, criterion = "dist", nrep = NULL) }


Appendix E: Semi-Structured Interview Protocol

These interviews were intended to explore the evolution of coding practices. Specifically, they focused on interviewees’ personal experiences of, and views on, these practices.

Background

• Could you tell me about your background, and how you came to work on this project?
• Could you describe your process of entering the community of this particular project?
• When do you usually spend your energy on the project? How many hours do you usually spend?

Coding Practices

• Could you provide an overview of events since you joined the project? Were there any notable events of importance during this period?
• Were there any conversations or chats discussing what the project was going to do before the event? Where did the conversations or chats take place? Did they happen on a regular basis? Did anyone within the community take primary responsibility for documenting them? Where were they documented?
• Which tasks do you take on within the project? Why do you choose those tasks?
• Please walk us through a typical/recent issue or commit you submitted.
  o Where did your idea come from? Why did you think the problem needed to be addressed?
  o What techniques did you employ to identify and analyze the problem you wanted to address? What development tools or resources were used in identifying the problem?
  o Were there any people that you turned to for identifying the problem? Could you tell us about your interactions with other project members, particularly as they pertain to the identification of the problem? How did you initiate contact, with whom did you make contact, and why was the process done this way?
  o Could you describe how you solved the problem? What technology platforms, modeling techniques, and tools were used? What were the specific reasons for choosing them?
  o Were there any project members or resources that you turned to for solving the problem? How did they help you in this regard? Which communication channels did you use?
  o Did you and other project members share the same understanding of what needed to be done during the process? Were there any disagreements? If there were conflicts, how did you negotiate them and achieve consensus?

Next steps

• I would like to have a continuing relationship. Would it be possible for us to talk every quarter?
• What would be interesting for you to find out? What would you like to have visibility into?
• Can I validate my models with you?
• Who else can I talk to?


References

Abbott, A. (1990). A Primer on Sequence Methods. Organization Science, 1(4), 375–392.

Abbott, A. (1992). From Causes to Events: Notes on Narrative Positivism. Sociological Methods & Research, 20(4), 428–455.

Abbott, A. (1995). Sequence Analysis: New Methods for Old Ideas. Annual Review of Sociology, 21(1), 93–113.

Abbott, A., & Hrycak, A. (1990). Measuring resemblance in sequence data: An optimal matching analysis of musicians’ careers. American Journal of Sociology, 96(1), 144–185.

Alon, U. (2009). How to choose a good scientific problem. Molecular Cell, 35(6), 726–728. http://doi.org/10.1016/j.molcel.2009.09.013

Ancona, D. G., & Chong, C. L. (1992). Entrainment: Cycles and synergy in organizational behavior. Working paper, Alfred P. Sloan School of Management, Massachusetts Institute of Technology.

Andres, H., & Zmud, R. (2002). A contingency approach to software project coordination. Journal of Management Information Systems, 18(3), 41–70.

Anjewierden, A., & Efimova, L. (2006). Understanding Weblog Communities Through Digital Traces: A Framework, a Tool and an Example. In On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops (pp. 279–289). Berlin, Heidelberg: Springer-Verlag.

Ashby, W. R. (1956). An Introduction to Cybernetics. London, England: Chapman & Hall.

Baldwin, C. Y., & Clark, K. B. (2000). Design Rules: The power of modularity. Boston, MA: MIT Press.

Becker, M. C. (2005). The concept of routines: some clarifications. Cambridge Journal of Economics, 29(2), 249–262.

Beer, S. (1984). The Viable System Model: Its Provenance, Development, Methodology and Pathology. Journal of the Operational Research Society, 35(1), 7–25.

Benbya, H., & McKelvey, B. (2006). Toward a complexity theory of information systems development. Information Technology & People, 19(1), 12–34.

Benkler, Y. (2006). The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press.

Berente, N., & Seidel, S. (2014). Big Data & Inductive Theory Development: Towards Computational Grounded Theory? In Americas Conference on Information Systems (AMCIS), Savannah, Georgia, August 7-10 (pp. 1–11).

Bird, C., Rigby, P., & Barr, E. (2009). The promises and perils of mining git. 2009 6th IEEE International Working Conference on Mining Software Repositories.

Blau, P., & Scott, W. (1962). Formal Organizations: A Comparative Approach. Stanford University Press.

Boehm, B. (1988). A spiral model of software development and enhancement. Computer, 21(5), 61–72.

Boland, R. J., Newman, M., & Pentland, B. T. (2010). Hermeneutical exegesis in information systems design and use. Information and Organization, 20(1), 1–20.

Bonaccorsi, A., & Rossi, C. (2006). Comparing motivations of individual programmers and firms to take part in the Open Source movement. From community to business. Knowledge, Technology & Policy, 1–33.

Brooks, F. P. (1995). The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley Professional.

Bruner, J. S. (1990). Acts of meaning. Harvard University Press.

Burt, R. (2000). The network structure of social capital. In Research in Organizational Behavior (Vol. 22, pp. 345–423).

Campbell, D. (1988). Task Complexity: A Review and Analysis. The Academy of Management Review, 13(1), 40–52.

Capiluppi, A. (2004). Improving comprehension and cooperation through code structure. In 26th International Conference on Software Engineering - W8S Workshop “Collaboration, Conflict and Control: The 4th Workshop on Open Source Software Engineering” (pp. 23–28).

Capiluppi, A., Faria, A., & Ramil, J. (2005). Exploring the Relationship between Cumulative Change and Complexity in an Open Source System. In Proceedings of the Ninth European Conference on Software Maintenance and Reengineering (CSMR’05) (pp. 1–9).

Capiluppi, A., González-Barahona, J. M., Herraiz, I., & Robles, G. (2007). Adapting the “staged model for software evolution” to free/libre/open source software. Ninth International Workshop on Principles of Software Evolution in Conjunction with the 6th ESEC/FSE Joint Meeting - IWPSE ’07, 79.

Cataldo, M., Herbsleb, J. D., & Carley, K. M. (2008). Socio-technical congruence: a framework for assessing the impact of technical and work dependencies on software development productivity. Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement - ESEM ’08, 2 – 11.

Christley, S., & Madey, G. (2007). Analysis of Activity in the Open Source Software Development Community. In 40th Annual Hawaii International Conference on System Sciences (HICSS’07) (pp. 1–10). IEEE.

Cohen, M. D., March, J. G., & Olsen, J. P. (1972). A Garbage Can Model of Organizational Choice. Administrative Science Quarterly, 17(1), 1.

Coleman, J. S. (1958). Relational analysis: The study of social organizations with survey methods. Human Organization, 17(4), 28–36.

Crowston, K. (1997). A coordination theory approach to organizational process design. Organization Science, 8(2), 157–175.

Crowston, K., & Howison, J. (2006). Hierarchy and Centralization in Free and Open Source Software Team Communications. Knowledge, Technology, & Policy, 18(4), 65–85.

Crowston, K., & Kammerer, E. (1998). Coordination and collective mind in software requirements development. IBM Systems Journal, 37(2), 227–245.

Crowston, K., & Scozzi, B. (2004). Coordination practices within FLOSS development teams: The bug fixing process. Computer Supported Activity Coordination, 4(1), 21–30.

Crowston, K., Wei, K., Howison, J., & Wiggins, A. (2012). Free/Libre Open-Source Software Development: What We Know and What We Do Not Know. ACM Computing Surveys, 44(2), 1–35.

Daft, R. L., & Lengel, R. H. (1986). Organizational Information Requirements, Media Richness and Structural Design. Management Science, 32(5), 554–571.

Daft, R. L., & Weick, K. E. (1984). Toward a Model of Organizations as Interpretation Systems. The Academy of Management Review, 9(2), 284–295.

Daniel, S., Agarwal, R., & Stewart, K. (2013). The Effects of Diversity in Global, Distributed Collectives: A Study of Open Source Project Success. Information Systems Research, 24(2), 312–333.


Davis, G. B., & Olson, M. H. (1984). Management information systems: conceptual foundations, structure, and development (2nd ed.).

Dewey, J. (1938). The Pattern of Inquiry. In Logic: The Theory of Inquiry. Henry Holt & Company.

DiMaggio, P., & Powell, W. (1983). The Iron Cage Revisited: Institutional Isomorphism and Collective Rationality in Organizational Fields. American Sociological Review, 48(2), 147–160.

DiMaggio, P., & Powell, W. (1991). The new institutionalism in organizational analysis. Chicago: University Of Chicago Press.

Dionysiou, D. D., & Tsoukas, H. (2013). Understanding the (Re)Creation of Routines from Within: A Symbolic Interactionist Perspective. Academy of Management Review, 38(2), 181–205.

Eisenhardt, K. M. (1989). Building Theories from Case Study Research. Academy of Management Review, 14(4), 532–550.

Eisenhardt, K. M., Furr, N. R., & Bingham, C. B. (2010). Microfoundations of Performance: Balancing Efficiency and Flexibility in Dynamic Environments. Organization Science, 21(6), 1263–1273.

Eisenhardt, K. M., & Martin, J. (2000). Dynamic capabilities: what are they? Strategic Management Journal, 21(10–11), 1105–1121.

Emirbayer, M., & Mische, A. (1998). What Is Agency? American Journal of Sociology, 103(4), 962–1023.

English, R., & Schweik, C. M. (2007). Identifying Success and Tragedy of FLOSS Commons: A Preliminary Classification of Sourceforge.net Projects. National Center for Digital Government, (Paper 29), 11–11.

Falk-Krzesinski, H. J., Contractor, N., Fiore, S. M., Hall, K. L., Kane, C., Keyton, J., … Trochim, W. (2011). Mapping a research agenda for the science of team science. Research Evaluation, 20(2), 145–158.

Feldman, M. S. (2003). A performative perspective on stability and change in organizational routines. Industrial and Corporate Change, 12(4), 727–752.

Feldman, M. S., & Pentland, B. T. (2003). Reconceptualizing Organizational Routines as a Source of Flexibility and Change. Administrative Science Quarterly, 48(1), 94– 121.


Feller, J., & Fitzgerald, B. (2000). A framework analysis of the open source software development paradigm. Proceedings of the Twenty First International Conference on Information Systems, 58–69.

Fichman, R. G., & Kemerer, C. F. (1992). Object-Oriented and Conventional Analysis and Design Methodologies Comparison and Critique. COMPUTER, October, 22–39.

Fitzgerald, B. (2006). The Transformation of Open Source Software. MIS Quarterly, 30(3), 587–598.

Folger, R., & Turillo, C. J. (1999). Theorizing as the Thickness of Thin Abstraction. The Academy of Management Review, 24(4), 742.

Franke, N., & Shah, S. (2003). How communities support innovative activities: an exploration of assistance and sharing among end-users. Research Policy, 32(1), 157–178.

Gabadinho, A., Ritschard, G., & Studer, M. (2011). Analyzing and Visualizing State Sequences in R with TraMineR. Journal Of Statistical Software, 40(4).

Gabadinho, A., Ritschard, G., Studer, M., & Müller, N. S. (2011). Mining sequence data in R with the TraMineR package: A user’s guide.

Galbraith, J. R. (1973). Designing Complex Organizations. Reading, MA: Addison-Wesley.

Galbraith, J. R. (1974). Organization Design: An Information Processing View. Interfaces, 4(3), 28–36.

Garud, R., & Karnøe, P. (2003). Bricolage versus breakthrough: distributed and embedded agency in technology entrepreneurship. Research Policy, 32(2), 277–300.

Gaskin, J., Berente, N., Lyytinen, K., & Yoo, Y. (2014). Toward Generalizable Sociomaterial Inquiry: A Computational Approach for Zooming In and Out of Sociomaterial Routines. MIS Quarterly, 38(3), 849–871.

Gaskin, J., Thummadi, V., & Lyytinen, K. (2011). Digital Technology and the Variation in Design Routines: A Sequence Analysis of Four Design Processes. Thirty Second International Conference on Information Systems, Shanghai, 1–16.

Geertz, C. (1973). The Interpretation of Cultures: Selected Essays. New York, NY: Basic Books.

Geertz, C. (2005). Deep play: notes on the Balinese cockfight. Daedalus, 134(4), 56–86.

Gersick, C. J., & Hackman, J. R. (1990). Habitual routines in task-performing groups. Organizational Behavior and Human Decision Processes, 47, 65–97.

Gibson, J. J. (1977). The Theory of Affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, Acting, and Knowing. Hillsdale, NJ: Lawrence Erlbaum.

Giddens, A. (1979). Central Problems in Social Theory: Action, Structure and Contradictions in Social Analysis. University of California Press.

Giddens, A. (1984). The Constitution of Society: Outline of the Theory of Structuration. University of California Press.

Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: strategies for qualitative research. Piscataway, New Jersey: Transaction Publishers.

Godfrey, M., & Tu, Q. (2000). Evolution in Open Source Software: A Case Study. In Proceedings of the International Conference on Software Maintenance (pp. 131–142).

Goggins, S. P., Galyen, K., & Laffey, J. (2010). Network Analysis of Trace Data for the Support of Group Work: Activity Patterns in a Completely Online Course. In Group ’10 (pp. 1–10).

Gousios, G., & Spinellis, D. (2012). GHTorrent: GitHub’s Data from a Firehose. In MSR 2012: 9th IEEE Working Conference on Mining Software Repositories (pp. 12–21).

Granovetter, M. (1990). The Myth of Social Network Analysis as a Special Method in the Social Sciences. Connections, 13(1-2), 13–16.

Gregor, S. (2009). Building Theory in the Sciences of the Artificial. In V. Vaishnavi (Ed.), DESRIST ’09: Proceedings of the 4th International Conference on Design Science Research in Information Systems and Technology. ACM Press.

Hærem, T., Pentland, B., & Miller, K. (2014). Task Complexity: Extending a Core Concept. Academy of Management Review, advance online publication, 1–38.

Herbsleb, J., & Moitra, D. (2001). Global Software Development. IEEE Software, 18(2), 16–20.

Highsmith, J., & Cockburn, A. (2001). Agile Software Development: The Business of Innovation. Computer, 34(9), 120–122.

Hirschheim, R., & Klein, H. K. (1994). Realizing Emancipatory Principles in Information Systems Development: The Case for ETHICS. MIS Quarterly, 18(1), 83–109.

Holland, J., Holyoak, K., Nisbett, R., & Thagard, P. (1989). Induction: Processes of Inference, Learning, and Discovery. Cambridge, MA: MIT Press.

Howison, J., & Crowston, K. (2014). Collaboration Through Open Superposition: A Theory of the Open Source Way. MIS Quarterly, 38(1), 29–50.

Humphrey, W. S. (1989). Managing the software process. Addison-Wesley Longman Publishing Co., Inc.

Hutchins, E. (1995). Cognition in the Wild. Cambridge, MA: MIT Press.

Iannacci, F., & Hatzaras, K. S. (2011). Unpacking ostensive and performative aspects of organisational routines in the context of monitoring systems: A critical realist approach. Information and Organization, 22(1), 1–22.

Jacobides, M. G., & Winter, S. G. (2012). Capabilities: Structure, Agency, and Evolution. Organization Science, 23(5), 1365–1381.

Jarvenpaa, S. L., & Lang, K. R. (2011). Boundary Management in Online Communities: Case Studies of the Nine Inch Nails and ccMixter Music Remix Sites. Long Range Planning, 44(5-6), 440–457.

Woodard, C. J., Ramasubbu, N., Tschang, F. T., & Sambamurthy, V. (2013). Design Capital and Design Moves: The Logic of Digital Business Strategy. MIS Quarterly, 37(2), 537–564.

Johnson, R. B., Onwuegbuzie, A. J., & Turner, L. A. (2007). Toward a Definition of Mixed Methods Research. Journal of Mixed Methods Research, 1(2), 112–133.

Johnson, V. (2007). What Is Organizational Imprinting? Cultural Entrepreneurship in the Founding of the Paris Opera. American Journal of Sociology, 113(1), 97–127.

Kane, G. C., Johnson, J., & Majchrzak, A. (2014). Emergent Life Cycle: The Tension Between Knowledge Change and Knowledge Retention in Open Online Coproduction Communities. Management Science, 60(12), 3026–3048.

Kirsch, L. (1996). The Management of Complex Tasks in Organizations: Controlling the Systems Development Process. Organization Science, 7(1), 1–21.

Klein, H., & Myers, M. (1999). A set of principles for conducting and evaluating interpretive field studies in information systems. MIS Quarterly, 23(1), 67–93.

Koch, S. (2005). Evolution of open source software systems: A large-scale investigation. In Proceedings of the First International Conference on Open Source Systems, Genova, Italy, July 11–15 (pp. 148–153).

Kogut, B., & Metiu, A. (2001). Open source software development and distributed innovation. Oxford Review of Economic Policy, 17(2), 248–264.

Krackhardt, D. (1994). Graph Theoretical Dimensions of Informal Organizations. In M. Prietula & K. Carley (Eds.), Computational Organization Theory (pp. 89–111). Hillsdale, NJ: Lawrence Erlbaum Associates.

Kraut, R., & Streeter, L. (1995). Coordination in software development. Communications of the ACM, 38(3), 69–81.

Lakhani, K., & von Hippel, E. (2003). How open source software works: “free” user-to-user assistance. Research Policy, 32(6), 923–943.

Langley, A. (1999). Strategies for theorizing from process data. Academy of Management Review, 24(4), 691–710.

Latoza, T. D., Garlan, D., Herbsleb, J. D., & Myers, B. A. (2007). Program Comprehension as Fact Finding. In ESEC-FSE ’07, September 3–7, 2007, Cavtat near Dubrovnik, Croatia (pp. 1–10).

Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabasi, A. L., Brewer, D., … Gutmann, M. (2009). Life in the network: the coming age of computational social science. Science, 323(5915), 721–723.

Lee, G. K., & Cole, R. E. (2003). From a Firm-Based to a Community-Based Model of Knowledge Creation: The Case of the Linux Kernel Development. Organization Science, 14(6), 633–649.

Lehman, M. (1980). Programs, Life Cycles, and Laws of Software Evolution. Proceedings of the IEEE, 68(9), 1060–1076.

Leonardi, P. (2007). Activating the Informational Capabilities of Information Technology for Organizational Change. Organization Science, 18(5), 813–831.

Lindberg, A. (2013). Understanding Change in Open Source Communities: A Co-Evolutionary Framework. In Academy of Management Meeting (pp. 1–40).

Lindberg, A., & Berente, N. (2014). Aligning Information Processing Capabilities and Gaps in Open Source Software Development. In Academy of Management Meeting, Philadelphia, PA (pp. 1–40).

Lindberg, A., Gaskin, J., Berente, N., Lyytinen, K., & Yoo, Y. (2013). Computational Approaches for Analyzing Latent Social Structures in Open Source Organizing. In Proceedings of the Thirty Fourth International Conference on Information Systems, Milan, Italy (pp. 1–19).

Liu, P., & Pentland, B. (2011). Dynamic capabilities and business processes: a trajectory view. In Proceedings of the Seventeenth Americas Conference on Information Systems, Detroit, MI (pp. 1–9).

Locke, K., Golden-Biddle, K., & Feldman, M. S. (2008). Making Doubt Generative: Rethinking the Role of Doubt in the Research Process. Organization Science, 19(6), 907–918.

Luthiger, B. (2005). Fun and software development. Proceedings of the First International Conference on Open Source Systems, (July), 273–278.

MacCormack, A., Rusnak, J., & Baldwin, C. Y. (2006). Exploring the Structure of Complex Software Designs: An Empirical Study of Open Source and Proprietary Code. Management Science, 52(7), 1015–1030.

March, J. (1991). Exploration and exploitation in organizational learning. Organization Science, 2(1), 71–87.

March, J. G., & Simon, H. A. (1958). Organizations. New York, NY: John Wiley & Sons.

Merton, R. K. (1957). Social Theory and Social Structure. New York: Free Press.

Mintzberg, H. (1979). The Structuring of Organizations: A Synthesis of the Research. Englewood Cliffs, NJ: Prentice-Hall.

Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), 309–346.

Monge, P., & Contractor, N. (2003). Theories of Communication Networks. New York: Oxford University Press.

Monteiro, E., & Østerlie, T. (2004). Keeping it going: The everyday practices of open source software.

Nachmanovitch, S. (1990). Free Play: Improvisation in Life and Art. New York, NY: Tarcher/Putnam.

Nelson, R. R., & Winter, S. G. (1982). An Evolutionary Theory of Economic Change. Cambridge, MA: Harvard University Press.

Nickerson, J. V. (2014). Collective Design: Remixing and Visibility. Design Computing and Cognition.

Nidumolu, S. (1995). The Effect of Coordination and Uncertainty on Software Project Performance: Residual Performance Risk as an Intervening Variable. Information Systems Research, 6(3), 191–219.

O’Mahony, S., & Ferraro, F. (2007). The Emergence of Governance in an Open Source Community. Academy of Management Journal, 50(5), 1079–1106.

Onwuegbuzie, A., & Collins, K. (2007). A typology of mixed methods sampling designs in social science research. The Qualitative Report, 12(2), 281–316.

Ovaska, P., Rossi, M., & Marttiin, P. (2003). Architecture as a coordination tool in multi-site software development. Software Process: Improvement and Practice, 8(4), 233–247.

Page, S. (2010). Diversity and Complexity. Princeton, NJ: Princeton University Press.

Parnas, D. (1972). On the criteria to be used in decomposing systems into modules. Communications of the ACM, 15(12), 1053–1058.

Parsons, T. (1960). Structure and Process in Modern Societies. New York: Free Press.

Pentland, B. T. (1995). Grammatical Models of Organizational Processes. Organization Science, 6(5), 541–556.

Pentland, B. T. (2003). Sequential Variety in Work Processes. Organization Science, 14(5), 528–540.

Pentland, B. T. (2005). Organizational routines as a unit of analysis. Industrial and Corporate Change, 14(5), 793–815.

Pons, P., & Latapy, M. (2005). Computing communities in large networks using random walks. Journal of Graph Algorithms and Applications, 10(2), 191–218.

Powell, W. (1990). Neither Market nor Hierarchy: Network Forms of Organization. Research in Organizational Behavior, 12, 295–336.

Prokopenko, M., Boschetti, F., & Ryan, A. J. (2009). An information-theoretic primer on complexity, self-organization, and emergence. Complexity, 15(1), 11–28. http://doi.org/10.1002/cplx.20249

Puranam, P., Alexy, O., & Reitzig, M. (2014). What’s “New” about New Forms of Organizing? Academy of Management Review, 39(2), 162–180.

Qureshi, I., & Fang, Y. (2010). Socialization in Open Source Software Projects: A Growth Mixture Modeling Approach. Organizational Research Methods, 14(1), 208–238.

Ragin, C. C. (1987). The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. London, England: University of California Press.

Rajlich, V., & Bennett, K. (2000). A Staged Model for the Software Life Cycle. Computer, 33(7), 66–71.

Raymond, E. S. (2001). The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. Sebastopol, CA: O’Reilly.

Roberts, J., & Hann, I. (2006). Understanding the motivations, participation, and performance of open source software developers: A longitudinal study of the Apache projects. Management Science, 52(7), 984–999.

Robles, G., Amor, J. J., Gonzalez-Barahona, J. M., & Herraiz, I. (2005). Evolution and Growth in Large Libre Software Projects. In Proceedings of the Eighth International Workshop on Principles of Software Evolution (IWPSE ’05) (pp. 1–10).

Royce, W. (1970). Managing the Development of Large Software Systems. Proceedings of IEEE WESCON, (August), 1–9.

Sabherwal, R. (2003). The evolution of coordination in outsourced software development projects: a comparison of client and vendor perspectives. Information and Organization, 13(3), 153–202.

Sabherwal, R., & Robey, D. (1993). An empirical taxonomy of implementation processes based on sequences of events in information system development. Organization Science, 4(4), 548–576.

Salvato, C. (2009). Capabilities Unveiled: The Role of Ordinary Activities in the Evolution of Product Development Processes. Organization Science, 20(2), 384–409.

Scacchi, W. (2001). Issues and Experiences in Modeling Open Source Software Development Processes. In 3rd ICSE Workshop on Open Source Software Engineering (pp. 1–5). Portland, OR.

Scacchi, W. (2009). Understanding Requirements for Open Source Software. In K. Lyytinen, P. Loucopoulos, J. Mylopoulos, & B. Robinson (Eds.), Design Requirements Engineering: A Ten-Year Perspective (pp. 467–494). Berlin, Germany: Springer.

Schelling, T. (1978). Sorting and mixing: race and sex. In T. Schelling (Ed.), Micromotives and Macrobehavior (pp. 137–166). W. W. Norton & Company.

Schroeder, R. G., Linderman, K., Liedtke, C., & Choo, A. S. (2008). Six Sigma: Definition and underlying theory. Journal of Operations Management, 26(4), 536–554.

Shah, S. K. (2006). Motivation, Governance, and the Viability of Hybrid Forms in Open Source Software Development. Management Science, 52(7), 1000–1014.

Shannon, C. (1948). A Mathematical Theory of Communication. The Bell System Technical Journal, XXVII(3), 379–423.

Simon, H. A. (1962). The Architecture of Complexity. Proceedings of the American Philosophical Society, 106(6), 467–482.

Simon, H. A. (1978). Rationality as Process and as Product of Thought. The American Economic Review, 68(2), 1–16.

Simon, H. A. (1986). Rationality in Psychology and Economics. Journal of Business, 59(4), 209–224.

Simon, H. A. (1996). The Sciences of the Artificial. Cambridge, MA: MIT Press.

Singh, P., Tan, Y., & Mookerjee, V. (2011). Network Effects: The Influence of Structural Capital on Open Source Project Success. MIS Quarterly, 35(4), 813–829.

Studer, M. (2013). WeightedCluster Library Manual: A practical guide to creating typologies of trajectories in the social sciences with R. Geneva, Switzerland.

Studer, M., Ritschard, G., Gabadinho, A., & Müller, N. S. (2011). Discrepancy Analysis of State Sequences. Sociological Methods & Research, 40(3), 471–510.

Teece, D. J., Pisano, G., & Shuen, A. (1997). Dynamic Capabilities and Strategic Management. Strategic Management Journal, 18(7), 509–533.

Thornton, P. H., Ocasio, W., & Lounsbury, M. (2012). The Institutional Logics Perspective: A New Approach to Culture, Structure, and Process. Oxford University Press.

Tuertscher, P., Garud, R., & Kumaraswamy, A. (2014). Justification and Interlaced Knowledge at ATLAS, CERN. Organization Science, 25(6), 1579–1608.

Vaast, E., & Walsham, G. (2011). Grounded theorizing for electronically mediated social contexts. European Journal of Information Systems, 22(1), 9–25. http://doi.org/10.1057/ejis.2011.26

Venkatesh, V., Brown, S., & Bala, H. (2013). Bridging the Qualitative-Quantitative Divide: Guidelines for Conducting Mixed Methods Research in Information Systems. MIS Quarterly, 37(1), 21–54.

Venkatraman, N. (1989). The Concept of Fit in Strategy Research: Toward Verbal and Statistical Correspondence. Academy of Management Review, 14(3), 423–444.

Venkatraman, N., & Camillus, J. (1984). Exploring the Concept of “Fit” in Strategic Management. Academy of Management Review, 9(3), 513–525.

Vidgen, R., & Wang, X. (2009). Coevolving Systems and the Organization of Agile Software Development. Information Systems Research, 20(3), 355–376.

Von Hippel, E. (1994). Innovation by User Communities: Learning from Open-Source Software. MIT Sloan Management Review.

Von Hippel, E., & von Krogh, G. (2003). Open Source Software and the “Private-Collective” Innovation Model: Issues for Organization Science. Organization Science, 14(2), 209–223.

Von Krogh, G., Spaeth, S., & Haefliger, S. (2005). Knowledge Reuse in Open Source Software: An Exploratory Study of 15 Open Source Projects. In Proceedings of the 38th Hawaii International Conference on System Sciences (pp. 1–10).

Von Krogh, G., Spaeth, S., & Lakhani, K. R. (2003). Community, joining, and specialization in open source software innovation: a case study. Research Policy, 32(7), 1217–1241.

Wasserman, S., & Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge, England: Cambridge University Press.

Watts, D. J. (2007). A twenty-first century science. Nature, 445(7127), 489.

Weber, S. (2005). The Success of Open Source. Cambridge, MA: Harvard University Press.

Weick, K. (2004). Designing for Thrownness. In R. Boland & F. Collopy (Eds.), Managing as Designing (pp. 74–78). Stanford University Press.

Weick, K. E. (1979). The Social Psychology of Organizing (2nd ed.). Reading, MA: Addison-Wesley.

Weick, K. E. (1995). Sensemaking in Organizations. Thousand Oaks, CA: SAGE Publications.

Wellman, B., & Berkowitz, S. (1988). Social Structures: A Network Approach. Cambridge, England: Cambridge University Press.

Wiggins, A., & Crowston, K. (2010). Reclassifying Success and Tragedy in FLOSS Projects. In Open Source Software: New Horizons (pp. 1–15).

Wilson, C. (2001). Activity patterns of Canadian women: Application of ClustalG sequence alignment software. In Transportation Research Board 80th Annual Meeting (Vol. 1777, pp. 1–27).

Windelband, W. (1904). Geschichte und Naturwissenschaft [History and Natural Science]. Strassburg: Heitz & Mundel.

Winograd, T. (1987). A language/action perspective on the design of cooperative work. Human-Computer Interaction, 3, 3–30.

Winograd, T., & Flores, F. (1986). Understanding Computers and Cognition: A New Foundation for Design. Intellect Books.

Wood, R. (1986). Task complexity: Definition of the construct. Organizational Behavior and Human Decision Processes, 37(1), 60–82.

Yin, R. K. (2008). Case Study Research: Design and Methods. Thousand Oaks, CA: Sage Publications.

Yoo, Y., Henfridsson, O., & Lyytinen, K. (2010). Research Commentary - The New Organizing Logic of Digital Innovation: An Agenda for Information Systems Research. Information Systems Research, 21(4), 724–735.

Zachariadis, M., Scott, S., & Barrett, M. (2013). Methodological Implications of Critical Realism for Mixed-Methods Research. MIS Quarterly, 37(3), 855–879.

Zmud, R. (1980). Management of large software development efforts. MIS Quarterly, 4(2), 45–55.
