City University of New York (CUNY) CUNY Academic Works

Publications and Research Hunter College

2020

Actor Concurrency Bugs: A Comprehensive Study on Symptoms, Root Causes, API Usages, and Differences

Mehdi Bagherzadeh Oakland University

Nicholas Fireman Oakland University

Anas Shawesh Oakland University

Raffi T. Khatchadourian CUNY Hunter College


More information about this work at: https://academicworks.cuny.edu/hc_pubs/664 Discover additional works at: https://academicworks.cuny.edu

This work is made publicly available by the City University of New York (CUNY). Contact: [email protected]

Actor Concurrency Bugs in Akka: A Comprehensive Study on Symptoms, Root Causes, API Usages, and Differences

MEHDI BAGHERZADEH, Oakland University, USA
NICHOLAS FIREMAN, Oakland University, USA
ANAS SHAWESH, Oakland University, USA
RAFFI KHATCHADOURIAN, City University of New York (CUNY) Hunter College, USA

Actor concurrency is becoming increasingly important in the development of real-world software systems. Although actor concurrency may be less susceptible to some multithreaded concurrency bugs, such as low-level data races and deadlocks, it comes with its own bugs that may be different. However, the fundamental characteristics of actor concurrency bugs, including their symptoms, root causes, API usages, examples, and differences when they come from different sources are still largely unknown. Actor software development can significantly benefit from a comprehensive qualitative and quantitative understanding of these characteristics, which is the focus of this work, to foster better API documentation, development practices, testing, debugging, repairing, and verification frameworks. To conduct this study, we take the following major steps. First, we construct a set of 186 real-world Akka actor bugs from Stack Overflow and GitHub via manual analysis of 3,924 Stack Overflow questions, answers, and comments and 3,315 GitHub commits, messages, original and modified code snippets, issues, and pull requests. Second, we manually study these actor bugs and their fixes to understand and classify their symptoms, root causes, and API usages. Third, we study the differences between the commonalities and distributions of symptoms, root causes, and API usages of our Stack Overflow and GitHub actor bugs. Fourth, we discuss real-world examples of our actor bugs with these symptoms and root causes. Finally, we investigate the relation of our findings with those of previous work and discuss their implications.
A few findings of our study are: ❶ symptoms of our actor bugs can be classified into five categories, with Error as the most common symptom and Incorrect Exceptions as the least common, ❷ root causes of our actor bugs can be classified into ten categories, with Logic as the most common root cause and Untyped Communication as the least common, ❸ a small number of Akka API packages are responsible for most of API usages by our actor bugs, and ❹ our Stack Overflow and GitHub actor bugs can differ significantly in commonalities and distributions of their symptoms, root causes, and API usages. While some of our findings agree with those of previous work, others sharply contrast.

CCS Concepts: • Theory of computation → Concurrency; • General and reference → Empirical studies.

Additional Key Words and Phrases: Akka actor bugs, Actor bug symptoms, Actor bug root causes, Actor bug API usages, Actor bug differences, Stack Overflow, GitHub.

ACM Reference Format:
Mehdi Bagherzadeh, Nicholas Fireman, Anas Shawesh, and Raffi Khatchadourian. 2020. Actor Concurrency Bugs in Akka: A Comprehensive Study on Symptoms, Root Causes, API Usages, and Differences. Proc. ACM Program. Lang. 1, OOPSLA, Article 1 (November 2020), 33 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Authors’ addresses: Mehdi Bagherzadeh, Oakland University, USA, [email protected]; Nicholas Fireman, [email protected], Oakland University, USA; Anas Shawesh, Oakland University, USA, [email protected]; Raffi Khatchadourian, City University of New York (CUNY) Hunter College, USA, [email protected].

Proc. ACM Program. Lang., Vol. 1, No. OOPSLA, Article 1. Publication date: November 2020.

1 INTRODUCTION

Actor concurrency is becoming increasingly important in the development of real-world software systems, which are built using industrial-strength actor programming frameworks

and languages, such as Akka [72], Orleans [82], and Erlang [24]. For example, Akka actors allow PayPal to serve more than a billion financial transactions per day [75], the Spark big data ecosystem to shuffle hundreds of terabytes of data [17], and Groupon to provide real-time personalized coupons to 48 million customers [74]. Twitter, LinkedIn, HP, Samsung, Walmart, Verizon, CapitalOne, and Weight Watchers are among other users of Akka actor concurrency [73]. Unlike multithreaded concurrency, in which threads communicate using shared memory and locks, in actor concurrency, actors communicate using asynchronous messages [18, 19]. The use of higher-level actors and messages, instead of lower-level threads and locks, makes actor concurrency less susceptible to some of the standard bugs in multithreaded concurrency, such as low-level data races and deadlocks [63]. However, actor concurrency comes with its own bugs that are different from multithreaded concurrency bugs.

There is previous work on the classification of actor bugs [51, 77], as well as testing [67, 93], debugging [35, 66, 78, 101], and verification [27, 28, 37–39, 41, 44, 47, 61, 85, 87, 95, 97, 99] of actor software. Although this work advances our knowledge of actor bugs, the fundamental characteristics of actor concurrency bugs, including their symptoms, root causes, API usages, examples, and differences when they come from different sources are still largely unknown. Actor software development can significantly benefit from a comprehensive quantitative and qualitative understanding of these characteristics of actor bugs to foster better API documentation, development practices, testing, debugging, repairing, and verification frameworks. For example, static bug mitigation tools tend to focus on bugs that are classified by their root causes and can use actor bug root causes and their classification [49].
The same is true for dynamic bug mitigation tools that target bugs that are classified by their symptoms. Symptoms, root causes, and fixes are the main criteria for defining bug classes that bug mitigation tools target [29, 83]. In this work, we present the first comprehensive study on these characteristics of real-world actor concurrency bugs in Akka and answer the following research questions:

∙ RQ1: Symptoms of actor concurrency bugs in Akka. What are the symptoms of real-world Akka actor bugs and their classification? What are the real-world examples of actor bugs with these symptoms? How common are these symptoms?
∙ RQ2: Root causes of actor concurrency bugs in Akka. What are the root causes of Akka actor bugs and their classification? What are the real-world examples of actor bugs with these root causes? How common are these root causes?
∙ RQ3: API usages of actor concurrency bugs in Akka. What APIs do Akka actor bugs use? How common are these APIs?
∙ RQ4: Differences of actor concurrency bugs in Akka. How different are the symptoms, root causes, and API usages for Akka actor bugs in Stack Overflow and GitHub? How different are their commonalities? How different are their distributions?

We conduct our study on Akka actor bugs that we construct using Stack Overflow and GitHub. Stack Overflow is the most common question-and-answer website, with more than 18 million questions, 28 million answers, 72 million comments, and 4 million developer participants [94]. Akka actor bugs in Stack Overflow give us insight about bugs for which developers may ask questions and receive help in fixing. Similarly, GitHub is the most common code repository for open-source software, with more than 100 million projects, 900 million commits, and 40 million developers [43, 55, 100]. Akka actor bugs in GitHub give us insight about bugs that developers leave in their code and later find and fix.
We focus on Akka as it is growing faster than other popular industrial-strength actor frameworks and languages, such as Orleans [82] and Erlang [24]. Over the past five years, there have been 7,291 Akka questions in Stack Overflow, 1.6 and 48 times more than the 4,544 Erlang and 152 Orleans questions, respectively. Similarly, there are 14,098 Akka projects in GitHub, 1.2 and 10.6 times more than the 11,312 Erlang and 1,325 Orleans projects, respectively.

To conduct this study, we take the following major steps. First, we construct a set of 186 real-world Akka actor bugs from Stack Overflow and GitHub via manual analysis of 3,924 Stack Overflow questions, answers, and comments and 3,315 GitHub commits, messages, original and modified code snippets, issues, and pull requests. Second, we manually study these actor bugs and their fixes to understand and classify their symptoms, root causes, and API usages. Third, we study the differences between the commonalities and distributions of symptoms, root causes, and API usages of our actor bugs in Stack Overflow and GitHub. Fourth, we discuss real-world examples of actor bugs with these symptoms and root causes. Finally, we investigate the relation of our findings with previous work and discuss their implications.

A few findings of our study are: ❶ symptoms of our actor bugs can be classified into five categories: Error, Unexpected Behavior, Incorrect Messaging, Incorrect Termination, and Incorrect Exceptions, where Error is the most common symptom (36.1%) and Incorrect Exceptions is the least common (2.2%), ❷ root causes of our actor bugs can be classified into 10 categories: Logic, Race, API Confusion, Explicit Life cycle, Programming, Messaging Patterns, Model Confusion, Misnaming, Misconfiguration, and Untyped Communication, where Logic is the most common root cause (19.4%) and Untyped Communication is the least common (1.1%), ❸ a small number of Akka API packages (2.7%) are responsible for most of API usages (97.7%) by our actor bugs, and ❹ our Stack Overflow and GitHub actor bugs can differ significantly in the commonality and distribution of their symptoms, root causes, and API usages.
While some of our findings agree with those of previous work, others sharply contrast. For example, ❺ our finding that Logic is the most common root cause for our actor bugs agrees with that of Hedden and Zhao [51]. In contrast, ❻ our symptoms Incorrect Messaging and Incorrect Exceptions, which are symptoms for 27.5% of our actor bugs, and our root causes API Confusion, Model Confusion, Misnaming, Misconfiguration, and Untyped Communication, which are root causes for 31.9% of our actor bugs, are new and cannot be found in previous work. Finally, we show the implications of our findings, e.g., ❼ Akka bug finding tools and techniques can be enhanced by our new symptoms and root causes.

All the data used in this study are publicly available at https://tinyurl.com/y4rkwhco under a Creative Commons Attribution 4.0 International license.

2 METHODOLOGY

In this section, we discuss our methodology for the collection and analysis of our Akka actor bugs from Stack Overflow and GitHub.

2.1 Data Collection

Akka actor bugs in Stack Overflow. In Stack Overflow, an “asker” developer asks a question and assigns one to five tags to the question to better specify its topic. “Answerer” developers provide answers or comments for the question, and the asker selects one of these answers as the accepted answer. Questions and answers may include code in their body. There are 18,947,469 questions, 28,717,692 answers, and 72,782,793 comments in Stack Overflow [94], as of December 2019, that are asked and answered by 4,697,218 developer participants.

To identify Akka actor bugs in Stack Overflow, we take the following steps. First, we identify 1,001 Akka actor questions (those with tags [Akka] and [Actor]). Second, we identify 704 actor coding questions (those that include code).
Third, we manually analyze these 704 coding questions, all their 926 answers, and 2,294 comments to identify actor bug questions and fix answers. In total, we analyze 3,924 questions, answers, and comments. An actor coding question is considered to be a “bug” question if its asker explicitly identifies a deviation from the expected behavior of its code. The accepted answer for this bug question is considered to be a “bug fix” answer if its answerer explicitly identifies a solution for the unexpected behavior of the code in the question. For this manual analysis, the second and third authors individually analyze actor coding questions, answers, and comments, and reiterate and refine until they agree on the set S of actor bug questions and fix answers. The first and fourth authors individually analyze, reiterate, and refine until they agree on and verify S. Our dataset S includes 130 Akka actor bugs and their fixes from Stack Overflow.

The first author is a Software Engineering and Programming Languages professor with extensive expertise in both actor and multithreaded concurrent systems [20, 25, 27, 28, 76]. The second and third authors are Ph.D. students with extensive coursework in actor, multithreaded, and cloud systems. The fourth author is a Software Engineering professor with extensive expertise in parallel and streaming systems [57, 58]. All authors have several years of industrial work experience.

Our data collection steps are in line with the best practices of previous work. Previous work often uses Stack Overflow tags, coding questions, and manual analysis to identify big data [26], concurrency [20], security [81, 104], and deep learning [53, 54] bugs and questions and answers. We write our code to process the Stack Exchange Data Dump [94] and its XML files to identify Akka actor and coding questions. Code snippets are marked with the XML tags <code></code> in the body of questions and answers.
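The tag and code filters of the first two steps can be sketched as follows. This is a minimal illustration and not the authors' actual script: the Question record, the lowercase tag spellings akka and actor, and the object and method names are assumptions, and a real run would first parse the XML files of the data dump into such records.

```scala
// Hypothetical, simplified view of a Stack Overflow question;
// the real data dump carries many more fields.
final case class Question(id: Int, tags: Set[String], body: String)

object QuestionFilter {
  // An Akka actor question carries both tags (tag spellings are assumed).
  def isAkkaActor(q: Question): Boolean =
    q.tags.contains("akka") && q.tags.contains("actor")

  // A coding question has at least one <code>...</code> snippet in its body.
  private val CodeSnippet = "(?s)<code>.*?</code>".r
  def isCoding(q: Question): Boolean =
    CodeSnippet.findFirstIn(q.body).isDefined

  // Candidate questions that would then go to manual analysis.
  def candidates(qs: Seq[Question]): Seq[Question] =
    qs.filter(q => isAkkaActor(q) && isCoding(q))
}
```

Only the automatic narrowing is modeled here; deciding whether a candidate is a bug question with a fix answer remains a manual step.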
The Stack Exchange Data Dump is publicly available and covers a time span of over 11 years, from August 2008 to December 2019.

Akka actor bugs in GitHub. In GitHub, a developer creates a repository to host a project and saves changes to the project using commits. A commit includes a message that specifies its purpose and changes to the original code. Another developer can open an issue to report a bug or ask for a new feature or open a pull request to ask for the integration of their code changes into the project. There are 83,624,114 projects, 930,401,807 commits and 67,442,279 issues in GitHub [100], as of April 2020, that are written by 24,154,883 developers.

To identify Akka actor bugs and fixes in GitHub, we take the following steps. First, we identify 10,832 Scala and Java Akka repositories (those with the keyword “Akka” in their names or descriptions and main languages Scala or Java). We focus on Scala and Java because they are the most common programming languages for Akka. There are 8,741 Scala and 2,498 Java Akka repositories in comparison to 964 C# and 12 C++ Akka repositories. Second, we identify 121 mature Akka repositories that have at least five stars and five contributors [33, 45, 84] and code that includes the import statement import akka.actor*. An Akka project must import the classes in the package akka.actor to work with actors. Third, we extract all 39,442 commits of these Akka repositories and stem the words in their messages. Stemming reduces a word to its base and allows for grouping and similar treatment of words with the same base. For example, the words “fixing”, “fixes”, and “fixed” all stem from the base “fix”. Fourth, we identify 3,315 candidate bug commits whose messages include the keywords “error,” “bug,” “fix,” “issue,” “mistake,” “incorrect,” “fault,” “defect,” and “flaw” [36, 60, 88].
Fifth, we manually analyze these 3,315 commits, their messages, original and modified code snippets, issues, and pull requests to identify actor bugs and fixes.

A GitHub commit is an actor “bug and fix” commit if its message, issues, or pull requests describe a bug in its original code and describe a fix in its modified code where the modified code makes changes to either Akka actor (sub)classes or usages of actor instances. For this manual analysis, the second and third authors individually analyze commits, their messages, original and modified code, pull requests, and issues and reiterate and refine until they agree on the set G of bug and fix commits. The first and fourth authors individually analyze, reiterate, and refine until they agree on and verify G.

Our dataset G of Akka actor bugs includes 56 actor bug commits and their fixes from GitHub. These bugs belong to 12 repositories for Gatling, Spray, PayPal/squbs, akka-persistence-mongo, oracle/wookiee, NewMotion/akka-rabbitmq, kamon-io/kamon-akka, horta-hell, dn4s, amient/affinity, gearpump, MessageClassifier, and akka-cluster-etcd projects. These projects cover a broad spectrum of functionalities ranging from load testing to web service development to cloud-based messaging.

Our data collection steps are in line with the best practices of previous work. Previous work often uses keywords, maturity criteria, and manual analysis to identify actor [51] and non-actor bug and fix commits [36, 53, 54, 60, 88] in mature GitHub projects [33, 45, 84]. We use the Natural Language Toolkit (NLTK) stemming algorithm [12], GitHub Search, GitHub Code Search, and its REST API during the steps above.

Akka actor bug dataset. Our Akka actor bug dataset B includes 186 bugs and their fixes, which is the union of 130 Stack Overflow actor bugs in S and 56 GitHub actor bugs in G. The scale of our bug dataset is in line with those of previous work. For example, Hedden and Zhao [51] study 126 Akka actor bugs selected from 12 projects, Torres Lopez et al. [77] study 23 actor bugs from 11 literary works, and Zhang et al. [106] study 87 Stack Overflow and 88 GitHub TensorFlow deep learning bugs. We include both run-time and compile-time bugs in B. The characteristics of these bugs are complementary, and the inclusion of both allows us to obtain a comprehensive understanding of the characteristics of our actor bugs. 89.2% of our bugs occur during the execution and 10.8% during the compilation.
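The stemming and keyword matching that Section 2.1 uses to select candidate bug commits can be sketched as follows. This is an illustration under stated assumptions: the study uses the NLTK stemming algorithm, whereas the stem function below is a crude suffix-stripping stand-in written only for this sketch, and the object and method names are hypothetical.

```scala
object CommitFilter {
  // Crude suffix stripper; a stand-in for a real stemmer, NOT NLTK's algorithm.
  def stem(word: String): String = {
    val w = word.toLowerCase
    if (w.endsWith("ing") && w.length > 5) w.dropRight(3)
    else if (w.endsWith("ed") && w.length > 4) w.dropRight(2)
    else if (w.endsWith("es") && w.length > 4) w.dropRight(2)
    else if (w.endsWith("s") && !w.endsWith("ss") && w.length > 3) w.dropRight(1)
    else if (w.endsWith("e") && w.length > 4) w.dropRight(1)
    else w
  }

  // Keywords from the paper, stemmed with the same function so that
  // message words and keywords are compared on equal footing.
  val keywords: Set[String] =
    Set("error", "bug", "fix", "issue", "mistake",
        "incorrect", "fault", "defect", "flaw").map(stem)

  // A commit is a candidate if any stemmed word of its message is a keyword.
  def isCandidate(message: String): Boolean =
    message.split("[^A-Za-z]+").filter(_.nonEmpty).map(stem).exists(keywords.contains)
}
```

Matching whole stemmed words, rather than substrings, keeps a message that mentions “prefix” from being selected because it contains “fix”. Candidates selected this way still require the manual analysis described above.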
The inclusion of both run-time and compile-time bugs is in line with the practices of previous work, e.g., Islam et al. [53, 54] include both run-time and compile-time bugs in their dataset of deep learning bugs.

2.2 Data Analysis

RQ1 and RQ2: Symptoms and root causes. To answer RQ1 and RQ2, we manually analyze each actor bug and fix in B. For a Stack Overflow bug and fix question and answer, we analyze the question, including its text and code snippets, to identify and classify the symptom of the bug. Similarly, we analyze all the answers of the question, including its accepted answer, and their comments to identify the root cause of the bug. For a GitHub bug and fix commit, we analyze the commit, its message, original code, and all its issues and pull requests to identify and classify the symptom of the bug. Similarly, we analyze the commit, its message, original and modified code, issues, and pull requests to identify the root cause of the bug. We use the open card sort [42] to classify symptoms and root causes. In the open card sort, there are no predefined symptom and root cause categories; instead, the categories are developed during the sorting process. During the sorting process, the first three authors individually assign categories to and classify symptoms and root causes of actor bugs and reiterate and refine until they all agree.

Our data analysis steps are in line with the best practices of previous work. Previous work often uses manual analysis and card sort to classify concurrency [20] and big data [26] Stack Overflow questions and answers, symptoms, root causes, and fixes for deep learning bugs [53, 54], and root causes for actor bugs [51, 77] and multiple reopenings of bugs [108].

RQ3: API usages. To answer RQ3, we manually analyze the code snippets for each actor bug and fix in B and collect the qualified names of Akka API classes for objects that are used in method invocations as receivers.
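Once such qualified names are collected, one way to summarize them by package is sketched below. The helper names and the sample input are illustrative assumptions and are neither the authors' tooling nor their data; the collection of the names themselves is manual in the study.

```scala
object ApiUsage {
  // Package part of a qualified class name,
  // e.g. "akka.actor.ActorRef" -> "akka.actor".
  def packageOf(qualifiedName: String): String = {
    val i = qualifiedName.lastIndexOf('.')
    if (i < 0) "" else qualifiedName.substring(0, i)
  }

  // Share of receiver usages that falls into each package.
  def usageShare(receivers: Seq[String]): Map[String, Double] =
    receivers.groupBy(packageOf).map { case (pkg, names) =>
      pkg -> names.size.toDouble / receivers.size
    }
}
```

For instance, given the hypothetical input Seq("akka.actor.ActorRef", "akka.actor.ActorSystem", "akka.actor.ActorRef", "scala.concurrent.Future"), three of the four usages fall into akka.actor.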
In Akka, a message send is implemented as a method invocation. For a Stack Overflow bug and fix question and answer, we analyze all the code snippets in the question and its accepted answer. Similarly, for a GitHub bug and fix commit, we analyze the original code of the bug and modified code of the fix. To disambiguate classes that have the same name but can exist in different packages, such as Failure in akka.actor.Status and scala.util packages, we study the context around the usage of the class in the code. The first three authors individually analyze the code snippets to find API usages and reiterate and refine until they agree.

RQ4: Differences. To answer RQ4, we investigate the differences of our actor bugs in Stack Overflow and GitHub for commonalities and distributions of their symptoms, root causes, and API usages. For commonality, we compare the percentages of symptoms, root causes, and API usages between Stack Overflow and GitHub bugs. For distribution, we use a statistical t-test and a Mann-Whitney U test to investigate if distributions of symptoms, root causes, and API usages are the same for Stack Overflow and GitHub bugs. We report the results that are confirmed by both of these tests.

3 BACKGROUND

In this section, we discuss the basics of the actor concurrency model and its implementation with additional features in the Akka actor framework.

3.1 Basic Actor Concurrency

Unlike multithreaded concurrency, where a program is modeled as a set of threads that communicate using shared memory and locks, in basic actor concurrency [18, 19], a program is modeled as a set of actors that communicate by sending, receiving, and processing asynchronous messages. An actor has its own thread of execution and behavior and makes its state accessible only through its messages. To send a fire-and-forget message, a “sender” actor sends the message without waiting or blocking for its response. To receive a message, a “receiver” actor enqueues the message by adding it to the end of its mailbox.
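The mailbox discipline of this model can be illustrated with a toy, single-threaded sketch. It is not how Akka is implemented; the class ToyActor and its methods are hypothetical names chosen for this illustration. A send only appends to the end of the mailbox and returns, and messages are later taken from the front and handled one at a time in arrival order.

```scala
import scala.collection.mutable

// Toy, single-threaded model of an actor; an illustration, not Akka's implementation.
final class ToyActor(name: String) {
  private val mailbox = mutable.Queue.empty[String]
  val processed = mutable.ArrayBuffer.empty[String]

  // Fire-and-forget send: append to the end of the mailbox and return immediately.
  def !(msg: String): Unit = mailbox.enqueue(msg)

  // Remove messages from the front of the mailbox and handle them one by one.
  def drain(): Unit =
    while (mailbox.nonEmpty) processed += s"$name handled ${mailbox.dequeue()}"
}
```

Sending two messages and then calling drain() handles them strictly in the order they were sent, which is the sequential behavior that matters for the bug examples in Section 4.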
To process a message, the actor dequeues a message by removing it from the beginning of its mailbox and processes it to completion before processing the next message in the mailbox. Messages are processed one by one and in the order they are received. During the processing of a message, an actor can change state, change behavior, send a message, or create a new actor.

3.2 Akka Actor Concurrency

To allow for the development of real-world actor software, Akka implements and adds several necessary features to the basic actor model. These features include additional message sending and receiving patterns, parental supervision, failure management, life cycle management, serialization, remoting, clustering, scheduling, dispatching, and testing.

Messaging patterns. Using messaging patterns, in addition to an asynchronous fire-and-forget message, a sender actor can send a request-response message and wait and block for its response in a future variable [48] for a timeout period. A “future” is a placeholder for an incomplete task with a result that is not yet ready; it will be ready when the future is “complete.” Similarly, a receiver can forward a message it receives, or route it using different strategies, such as round-robin and broadcast. If a “receiver” actor receives a message that it cannot process using its current behavior, it could stash the message in a temporary buffer and process it when the actor changes to the appropriate behavior. If a message cannot be delivered to its original receiver actor, such as a receiver that is terminated, Akka delivers the message to a special receiver actor /deadLetters.

Parental supervision. With parental supervision, a child actor can only be created by a parent actor. The parent is responsible for supervising its children and managing their failures.

An actor fails when it throws an exception during its creation or message processing. The supervising strategy of a supervisor specifies if the supervisor should resume, restart, or stop the child or escalate the failure to the supervisor of the supervisor. An actor is created in a parental hierarchy and resides in an actor system that provides services, such as scheduling, dispatching, and configuration.

Actor life cycle. With life cycles, an actor can be programmatically created and shut down. An actor can watch the life cycle of another actor and receive messages, such as a termination message, when the watched actor goes through different stages of its life cycle.

Remoting and clustering. With remoting, actors that are in separate Java Virtual Machines (JVMs) can send and receive serialized messages. An actor resides in an actor system, which, in turn, resides in a JVM. Several actor systems can reside in one JVM. With clustering, individual actor systems can form a cluster.

Schedulers and dispatchers. A “scheduler” schedules the execution of a message send or a task to occur at or during a specific time or time period. A dispatcher assigns a thread to an actor for its execution and processing of its messages. The dispatcher has an executor that provides the thread from a thread pool that it maintains.

4 SYMPTOMS

In this section, we answer RQ1 and discuss the symptoms of our actor bugs, their classification and commonalities, and provide real-world examples of actor bugs with these symptoms and how developers discuss these symptoms. We also compare our symptoms with those of actor bugs from previous work by Bianchi et al. [32]. For space reasons, we adapt our bug examples from Stack Overflow bugs, which are more likely to include minimal code examples [10]. Our examples are written in Scala.

Figure 1 shows the symptoms of our actor bugs, their numbers, and commonalities in terms of percentages. According to this figure, the symptoms of our actor bugs can be classified into the following 5 categories, with decreasing commonalities: Error, Unexpected Behavior, Incorrect Messaging, Incorrect Termination, and Incorrect Exceptions. Among these, Error is the most common symptom (36.1%) and Incorrect Exceptions is the least common (2.2%).

[Fig. 1. Symptoms of our actor bugs. Bar chart of the number of bugs per symptom: Error 67 (36.1%), Behavior 53 (28.5%), Messaging 47 (25.3%), Termination 15 (8.1%), Exception 4 (2.2%); average = 37.2.]

4.1 Symptom 1: Error

Error is the most common symptom (36.1%) for our actor bugs. The impact of a bug with this symptom is that an actor or its enclosing application throws an error or exception during its execution or compilation. Errors cover a broad spectrum and range from out of memory and timeout errors to non-unique actor name errors.

Figure 2 shows a buggy actor program with an Error symptom. Here, the developer intends to calculate the number Pi, using Master and Worker actors. Master divides the calculation of Pi between its worker actors, of type Worker, and accumulates their partial calculations. Master receives a Calc message and to process it creates numWorkers number of workers and sends numMessages number of Work messages to each worker. In Akka, an actor is a class that extends the trait Actor. The receive method specifies the messages that an actor receives and processes. The ActorSystem method creates an actor system with a given name and configuration. The actorOf() method creates an actor with a specified name using a Props configuration object. The invocation of actorOf() on an actor system creates an actor that resides in the actor system. The ! operator sends a fire-and-forget message. In Scala, the construct case allows for case analysis using pattern matching. A case class allows for the construction of immutable objects that can be constructed without new. The val keyword declares an immutable value.

    class Master extends Actor { ..
      val sys = ActorSystem(..) ..
      def receive = {
        case Calc(numWorkers, numMessages, numElements) => {
          for (i <- 0 until numWorkers) {
            val worker = sys.actorOf(Props[Worker], "worker")
            for (j <- 0 until numMessages)
              worker ! Work(0, numElements) } } } .. }
    class Worker extends Actor { .. }
    case class Work(..) { .. }

Fig. 2. Actor bug [1] with an Error symptom.

However, this program is buggy with an Error symptom. The impact of this symptom is that the Master actor cannot calculate Pi and instead throws a runtime exception, of type InvalidActorNameException. An InvalidActorNameException is thrown when the name of an actor is invalid. This is because Master attempts to create multiple worker actors with the same name “worker.” However, in Akka, actor names must be unique. The developer explains this symptom and its impact as:

“I am getting an error when trying to create a .. Master Akka actor.
I am not sure why I am getting this error: [InvalidActorNameException: actor name worker is not unique!]”

An application throwing an out-of-memory exception after it uses all of its available memory, a future throwing a timeout exception when it does not complete within its timeout period, an actor throwing a transport exception after it sends a message to a remote actor that it cannot look up, and an actor throwing an exception when it cannot process the termination message of the actor it is watching are more examples of our Error symptom.

Our Error symptom overlaps with Bianchi et al.’s [32] Crash and Assertion Violation symptoms. They classify the existing actor testing techniques [62, 93, 97] and identify three symptoms, Crash, Deadlock, and Assertion Violation, that these techniques use to identify buggy executions at runtime. Their Crash symptom “identifies executions that lead to system crashes.” Similarly, their Assertion Violation symptom is about the violation of “assertions about the correct behavior of a system.” An assertion violation throws an exception.

4.2 Symptom 2: Unexpected Behavior

Unexpected Behavior is the second most common symptom (28.5%) for our actor bugs. The impact of a bug with this symptom is that an actor or its enclosing application does not behave in a way that the developer expects from its implementation. Unexpected behaviors cover a broad spectrum and range from misbehaving schedulers, dispatchers, and supervisor actors to misbehaving loggers and method invocations.

    class Main extends App { ..
      val sys = ActorSystem(..)
      val fileWriter = sys.actorOf(Props(new FileWriter),..)
      Future { fileWriter ! Save (file1, str1) }
      Future { fileWriter ! Save (file2, str2) } }
    class FileWriter extends Actor {
      def receive = {
        case Save(file, str) => saveToFile(file, str) } }
    case class Save(file: String, str:String)

Fig. 3. Actor bug [11] with a Behavior symptom.

Figure 3 shows a buggy actor program with an Unexpected Behavior symptom. Here, the developer intends to write to two files concurrently using a FileWriter actor, which receives a Save message. To process it, the actor writes its string str to a file with the name file. The Main application creates fileWriter and sends two Save messages to it for writing strings str1 and str2 to files file1 and file2, respectively. Main wraps these message sends in future
tasks, of type Future, to allow for their concurrent execution. In Scala, an application extends the App trait, which provides the main() entry method.

However, this program is buggy with an Unexpected Behavior symptom. The impact of this symptom is that the fileWriter actor does not write to file1 and file2 concurrently but instead sequentially. This is because, in Akka, an actor processes its messages sequentially. The second Save message is processed only after the processing of the first one is finished. The developer explains this symptom and its impact as (punctuation added for clarity):

“I am trying to write to multiple files concurrently using the Akka framework. First, I create a[n] [actor] class called FileWriter that writes to a file. Then, using futures, I [send messages to] the [fileWriter actor] twice, hoping that 2 files will be created and [written into] for me [concurrently]. But, when I monitor the execution of the program, it first populates [and writes into] the first file and then the second one [sequentially].”

A scheduler not scheduling its message sends after the application increases the number of its actors, an actor resetting its state unexpectedly after processing a message, the sender method wrongly returning the value of the self method, and an actor leaking its database connections are more examples of our Unexpected Behavior symptom. The sender method returns the sender of the current message of an actor. The self method returns the current actor instance.

Our Unexpected Behavior symptom overlaps with Bianchi et al.’s [32] Crash symptom.

4.3 Symptom 3: Incorrect Messaging

Incorrect Messaging is the third most common symptom (25.3%) for our actor bugs. The impact of a bug with this symptom is that an expected message is not sent by an intended sender of the message or not received, stashed, or processed by its intended receiver.
The opposite holds for an unexpected message.

    object Starbucks extends App { ..
      val employees = List(sys.actorOf(Props[Employee], "Penny"),..)
      val customers = List(("Raj", "Machiato"), ..)
      val router = sys.actorOf(Props.empty.withRouter
        (SmallestMailboxRouter(routees = employees)))
      customers foreach{.. customer => router ! CanIHave(..) } }
    class Employee extends Actor {
      def receive = { case CanIHave(coffee, name) => ..
        sender.tell(MakeCoffee(coffee, name), ..) .. } }

Fig. 4. Actor bug [3] with a Messaging symptom.

Figure 4 shows a buggy actor program with an Incorrect Messaging symptom. Here, the developer intends to route the requests of the customers of the Starbucks coffee shop application to the employees of the shop. Starbucks creates the router actor, of type SmallestMailboxRouter, with a list of employee actors as routees. A router is an actor that receives a message and forwards it to its routees without changing the sender of the message to itself. For each customer in the customers list, Starbucks sends a CanIHave message to router, and router forwards this message to employees, of type Employee. The Employee receives a CanIHave message and, to process it, sends a MakeCoffee message back to its sender for the final preparation of the coffee. The developer assumes that the sender of CanIHave is router; therefore, MakeCoffee is sent to router, which forwards this message back to employees. In Akka, the withRouter() method configures a router actor with a given routing strategy. A SmallestMailboxRouter forwards each message to the routee with the fewest messages in its mailbox. The tell() method sends a fire-and-forget message to its receiver object.

However, this program is buggy with an Incorrect Messaging symptom. The impact of this symptom is that the MakeCoffee message is not sent to router; thus, it is not routed to employees. Instead, MakeCoffee is sent to /deadLetters.
This is because router does not change the sender of the CanIHave message that it forwards from Starbucks to Employee. Therefore, Starbucks is the sender of CanIHave and not router. However, Starbucks is an
application and not an actor; thus, sender in Employee evaluates to /deadLetters instead of router. The developer explains this symptom and its impact as:

“I can’t send a [MakeCoffee message] reply back to the router so that another [Employee] actor in the routing list [employees] can pick this [MakeCoffee message] up . . . when replying [to the router], it will give the following message in the console: . . . Message [router.MakeCoffee$] from Actor[akka://actorSystem//user//Penny/#-847662818] to Actor[akka://actorSystem/deadLetters] was not delivered.. dead letters encountered.”

The actor name akka://actorSystem//user//Penny/#-847662818 denotes the Penny/#-847662818 actor instance for the employee Penny in employees, which is a user actor in the actorSystem actor system. Similarly, /deadLetters is an actor in actorSystem. Akka distinguishes between user-created and system-created actors.

A parent actor not receiving the termination message of its child when it terminates, a mailbox receiving and processing more messages than its specified capacity, an actor losing its stashed messages, and a server actor ignoring messages from its client when they are sent too fast and too frequently are more examples of our Incorrect Messaging symptom.

Our Incorrect Messaging symptom is new and cannot be found in Bianchi et al.’s [32] symptoms for actor bugs.

4.4 Symptom 4: Incorrect Termination

Incorrect Termination is the fourth most common symptom (8.1%) for our actor bugs. The impact of a bug with this symptom is that an actor or its enclosing application does not terminate, terminates prematurely, terminates and restarts infinitely, or hangs.

    object PureAkka {
      def main(argv : Array[String]) = {
        val actorSystem = ActorSystem(..)
        val myActor = actorSystem.actorOf(Props(new MyActor {
          override def receive = { .. }
          override def preStart() = println("prestart")
          override def postStop() = println("poststop") }))
        /* work with myActor*/
        myActor ! PoisonPill } }
    class MyActor extends Actor { .. }

Fig. 5. Actor bug [6] with a Termination symptom.

Figure 5 shows a buggy actor program with an Incorrect Termination symptom. Here, the developer intends to create the actor myActor, of type MyActor, work with it for some time, and then terminate the application. The PureAkka application creates myActor and overrides its receive(), preStart(), and postStop() methods. After working with myActor, PureAkka shuts down the actor. The preStart() and postStop() life cycle methods are invoked by Akka automatically before an actor starts its execution and after it terminates, respectively. The PoisonPill message terminates the actor that receives and subsequently processes it.

However, this program is buggy with an Incorrect Termination symptom. The impact of this symptom is that the PureAkka application continues its execution and does not terminate even after the termination of the myActor actor. This is because terminating myActor terminates neither its enclosing actor system actorSystem nor the PureAkka application in which it executes. The invocation of preStart(), before myActor starts its execution, prints “prestart” on the console. Similarly, the invocation of postStop(), after myActor stops its execution, prints “poststop.” However, both actorSystem and PureAkka continue their executions after the termination of myActor. The developer explains this symptom and its impact as:

“I’ve written . . . code that starts [myActor], kills it and finishes execution. This code prints [to the console]: ’[info] prestart’ ’[info] poststop.’ But, [the PureAkka application] refuses to stop until I kill the process with CTRL-C.
What does the application wait for?”

An actor not terminating after processing a message that throws an exception, an application hanging after increasing the number of messages that its actors should process, a parent actor waiting on the termination of its children and never terminating, and an
application deadlocking when shutting down its actor system are more examples of our Incorrect Termination symptom.

Our Incorrect Termination symptom overlaps with Bianchi et al.’s [32] Deadlock symptom. Their Deadlock symptom “identifies [buggy] executions that lead to deadlocks.”

4.5 Symptom 5: Incorrect Exceptions

Incorrect Exceptions is the least common symptom (2.2%) for our actor bugs. The impact of a bug with this symptom is that an intended actor does not throw, catch, or properly handle an expected exception. The opposite holds for an unexpected exception.

    object app extends App {
      try{
        var act = context.actorOf(Props(classOf[MyActor],..)..)}
      catch{ case e: UserExc => println("failed!") } }
    class MyActor(..) extends Actor {
      def this(..) { throw new UserExc(..) }
      def receive = { .. } .. }
    class UserExc extends Exception { .. }

Fig. 6. Actor bug [9] with an Exceptions symptom.

Figure 6 shows a buggy actor program with an Incorrect Exceptions symptom. Here, the developer intends to create the act actor, of type MyActor, and catch and handle the exception that its creation may throw. The app application creates act and surrounds its construction with a try-catch to catch and handle its UserExc exception by printing “failed!” to the console. MyActor defines the constructor this() that throws UserExc. In Akka, the context variable is the context of the current actor. The invocation of actorOf() on context creates an actor as the child of the current actor.

However, this program is buggy with an Incorrect Exceptions symptom. The impact of this symptom is that the exception UserExc is not caught in the application app. This is because, in Akka, an exception that an actor throws, during its creation or message processing, is caught and handled by its parent actor. Here, the parent of act is the actor /user and not app. In Akka, /user is the parent of all user actors.
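The general effect behind this symptom, that a try-catch around the call which starts asynchronous work does not see an exception thrown by that work, can be reproduced without Akka. The following is a minimal sketch that uses scala.concurrent.Future as a stand-in for asynchronous actor creation. It is an analogy only and does not model Akka's parental supervision; the names AsyncFailure and startWorker are hypothetical, and UserExc merely mirrors the exception in Figure 6.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AsyncFailure {
  // Hypothetical exception type, mirroring UserExc in Figure 6.
  final class UserExc(msg: String) extends Exception(msg)

  // Stand-in for asynchronous construction: the body runs on another thread.
  def startWorker(): Future[String] =
    Future { throw new UserExc("constructor failed") }

  // Returns (whether the try-catch saw the exception, what the observer saw).
  def run(): (Boolean, String) = {
    var caught = false
    val worker =
      try startWorker() // returns immediately; nothing is thrown on this thread
      catch { case _: UserExc => caught = true; Future.successful("unused") }
    // The failure is visible only to code that observes the asynchronous result.
    val observed = Await.result(
      worker.recover { case e: UserExc => "handled: " + e.getMessage },
      5.seconds)
    (caught, observed)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In this sketch the failure has to be handled where the asynchronous result is observed (here through recover), which loosely corresponds to handling an actor's failure in its parent rather than at the actorOf() call site.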
The developer explains this symptom and its impact as:

“The problem I’m having is that it seems like, in Akka, the context.actorOf() call [in the application app] isn’t actually creating the MyActor object [act] itself [in the same thread that app runs on], but deferring it to another thread. So when the constructor [of MyActor] throws an exception, the try-catch block that I put in has no effect.”

A sender actor not being able to catch the exception that its receiver actor throws when processing its messages, a child actor not being able to log the exception it throws after its parent actor is terminated, and a test application not being able to catch the exception that its actor under test throws are more examples of our Incorrect Exceptions symptom.

Our Incorrect Exceptions symptom is new and cannot be found in Bianchi et al.’s [32] symptoms for actor bugs.

4.6 Implications

Using our new symptoms to extend the set of buggy executions to test for and identify. Bianchi et al. [32] classify Bita [97], Basset [62], and the work by Sen and Agha [93] as bug finding tools and techniques for actor software. These works generate inputs, messages, and their orders to run the code and use Crash, Deadlock, and Assertion Violation symptoms as testing oracles to identify buggy executions and bugs. For example, Bita finds Akka bugs with Crash and Assertion Violation symptoms. Our Error, Unexpected Behavior, and Incorrect Termination symptoms overlap with Bianchi et al.’s symptoms. However, our Incorrect Messaging and Incorrect Exceptions symptoms, accounting for 27.5% of our actor bugs, are new. Previous and future actor bug finding tools and techniques can use our new
symptoms to extend the set of executions that they consider as buggy and therefore increase the number of bugs that they may find.

5 ROOT CAUSES

In this section, we answer RQ2 and discuss the root causes of our actor bugs, their classification and commonalities, and provide real-world examples of actor bugs with these root causes and their fixes. We also compare our root causes with those of actor bugs from previous work by Hedden and Zhao [51] and Torres Lopez et al. [77].

[Figure 7: bar chart. Vertical axis: number of bugs. Horizontal axis: root causes (Logic, Race, API, Life, Prog, Patterns, Model, Naming, Config, Untype). Average = 18.6.]

Fig. 7. Root causes of our actor bugs.

Figure 7 shows the root causes of our actor bugs, their numbers, and commonalities in terms of percentages. According to this figure, the root causes of our actor bugs can be classified into the following ten categories, with decreasing commonalities: Logic, Race, API Confusion, Explicit Life cycle, Programming, Messaging Patterns, Model Confusion, Misnaming, Misconfiguration, and Untyped Communication. Among these, Logic is the most common root cause (19.4%) and Untyped Communication is the least common (1.1%).

5.1 Root Cause 1: Logic

Logic is the most common root cause (19.4%) for our actor bugs. A program is a set of steps to implement a logic that transforms the input of the program to its desired output and effects. Akka developers are responsible for the correct implementation of the logic of their programs. Otherwise, incorrect implementation of the logic can cause bugs.

    bug:
    class TcpConnection(..) extends Actor .. { ..
      val channel:SocketChannel = ..
      def receive: Receive = {
        case Write(data,..) =>
          /* write data into channel*/}
      class Pend(rem:ByteString, buffer:ByteBuffer,..)..{
        def writeToChannel(data:ByteString): Pend = { ..
          channel.write(buffer) ..
          if (buffer.hasRemaining) {
            if (data eq rem) this
            else new Pend(data, buffer, ..) }
          else if (data.nonEmpty) { ..
            val copied = rem.copyToBuffer(buffer) ..
            writeToChannel(rem drop copied) } .. } .. }
    fix:
            val copied = data.copyToBuffer(buffer) ..
            writeToChannel(data drop copied) } .. } .. }

Fig. 8. Actor bug [8] with a Logic root cause.

Figure 8 shows a buggy actor program with a Logic root cause. Here, the developer intends to accept a TCP connection in the TcpConnection actor and transfer data over this connection by writing it to a network socket channel of type SocketChannel. TcpConnection receives a message Write and, to process it, writes its data to channel. Writing to a SocketChannel requires a buffer to sit between the channel and the TcpConnection. The data is copied to the buffer and then written to the channel from the buffer. Depending on the sizes of the buffer, data, and channel, there are no guarantees that the data can be copied to the buffer or that the buffer can be written into the channel fully at once. Therefore, a loop is needed that attempts to copy and write some data to and from the buffer to the channel, tracks uncopied and unwritten data, and copies and writes them in the next attempt.

The recursive writeToChannel() method of the Pend class copies data to buffer, writes buffer to channel, and tracks the unwritten data rem of the buffer. To write data to channel, writeToChannel() first attempts to write the unwritten data of buffer, from its previous invocation, to channel. Afterwards, it checks if there is any unwritten data remaining in
the buffer. If there is, writeToChannel() returns the receiver this of its current invocation if data is old and is the same as rem that was not written in its previous invocation. Otherwise, writeToChannel() creates a new Pend object for writing data that is new and pending to be written to the channel later. If there is no unwritten data remaining in the buffer and data is not empty, writeToChannel() attempts to copy rem to the buffer and invokes itself with any data in rem that cannot be copied.

However, this program is buggy with an Unexpected Behavior symptom and a Logic root cause. The impact of this symptom is that larger data is broken into smaller parts for copying and writing; however, these parts become mixed and garbled such that the original and transferred data differ. The developer explains this symptom as, “Tcp.Writes get garbled . . . if a write [to channel] is [larger] than 300k”. The root cause of this bug is that the developer implements the logic for writing the data to the channel incorrectly and copies rem to the buffer, instead of data, although rem will be written to the channel using this. Similarly, the developer invokes writeToChannel() recursively with any uncopied part of rem instead of data. This causes rem, or part of it, to be written into the channel more than once or data, or part of it, not to be written at all. The fix for this bug, as shown in Figure 8, suggests copying data, and not rem, to the buffer and invoking writeToChannel() using the uncopied part of data, and not rem. In Akka, the copyToBuffer() method of a ByteString copies data into a buffer and returns the number of copied bytes. The drop() method removes the first n bytes of a ByteString.

Our Logic root cause overlaps with Hedden and Zhao’s Logic root cause [51].
They study 126 Akka actor bugs in 12 projects from the ScalaIndex website [91] and classify their root causes into three categories and ten subcategories, with subcategories inside parentheses: Logic, Communication (Response, Connection, Error Handling, Message Order), and Coordination (Cooperation, Shutdown, Recovery, Workload, Operation Order, Creation). Their Logic bugs “range from common null pointer errors, optimization issues for buffers . . . and any other number of bugs a program must solve during development.” In addition, our observation that Logic is the most common root cause for our actor bugs agrees with Hedden and Zhao’s observation that “Communication and Coordination [bugs together] occur less than common Logic bugs.” However, their Logic bugs are 57.9% of their bugs, whereas our Logic bugs are only 19.4% of our actor bugs.

5.2 Root Cause 2: Race

Race is the second most common root cause (14.6%) for our actor bugs. A race happens when two conflicting computations have different execution and program orders. Two computations conflict if they access the same memory region concurrently and at least one of them writes to the region. Execution and program orders specify the orders in which computations execute at run time and occur in the program code at compile time. Akka developers are responsible for writing code that is free from races. Otherwise, races can cause bugs.

Figure 9 shows a buggy actor program [14] with a Race root cause. Here, in the actor DownloadImageActor, the developer intends to asynchronously download an image and inform the actor that has requested this download of its success or failure. DownloadImageActor receives a DownloadImage message from a sender actor, say A, and to process it invokes the asynchronous method downloadImage() of imageDownloadService.
This asynchronous invocation means that DownloadImageActor does not block and wait for the completion of the invocation result. Instead, the invocation returns immediately with an incomplete future as the placeholder for its result. The future is complete when the invocation finishes its execution. To process the result of the invocation, DownloadImageActor registers an
onComplete() callback to be invoked when the future is complete. The callback sends an ImageDownloadSuccess message back to the sender A if the value of the future is a Success.

    bug:
    class DownloadImageActor(..) extends Actor .. { ..
      override def receive: Receive = {
        case DownloadImage(jobId, imageUrl) =>
          imageDownloadService.downloadImage().onComplete {
            case Success(image) => sender() ! ImageDownloadSuccess(..)
            case Failure (..) => .. } } }
    fix:
          val client = sender()
          imageDownloadService.downloadImage().onComplete {
            case Success(image) => client ! ImageDownloadSuccess(..)

Fig. 9. Actor bug [14] with a Race root cause.

However, this program is buggy with an Incorrect Messaging symptom and a Race root cause. The impact of this symptom is that the sender A does not receive the ImageDownloadSuccess message, and instead this message is sent to the /deadLetters actor. The developer explains this symptom as “deadletters encountered.” The root cause of this bug is that there is a race between accesses to the value of the sender actor. downloadImage and DownloadImageActor could run concurrently and on two different threads. By the time downloadImage is complete, DownloadImageActor could have received a new message from another sender, say B, that is different from the original sender A. Therefore, sender() evaluates to B and ImageDownloadSuccess is sent to B, instead of A. Here, not only is B the wrong sender, but by the time ImageDownloadSuccess is sent to it, B may no longer exist to receive the message, for example due to termination. Therefore, ImageDownloadSuccess cannot be delivered to B and instead is delivered to /deadLetters.

The fix for this bug, as shown in Figure 9, suggests invoking sender() outside onComplete(), instead of inside, assigning its value to the variable client, and sending ImageDownloadSuccess to client, instead of sender().
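The difference between reading the sender inside the callback and capturing it beforehand can be modeled without Akka. The following is a minimal sketch under stated assumptions: MiniActor and its currentSender field are hypothetical stand-ins for an actor and its sender() method, a Promise plays the role of the pending download, and the arrival of a message from B is simulated by a plain assignment.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SenderCapture {
  // Hypothetical stand-in for an actor: currentSender plays the role of
  // sender(), which changes with every message the actor processes.
  final class MiniActor {
    @volatile var currentSender: String = "none"

    // Buggy: the callback reads currentSender when the future completes.
    def handleBuggy(from: String, download: Future[String]): Future[String] = {
      currentSender = from
      download.map(_ => currentSender)
    }

    // Fixed: the sender is captured in a val before the callback is registered.
    def handleFixed(from: String, download: Future[String]): Future[String] = {
      currentSender = from
      val client = currentSender
      download.map(_ => client)
    }
  }

  // Returns the reply targets chosen by the buggy and the fixed handler when
  // a message from B arrives before the downloads requested by A complete.
  def run(): (String, String) = {
    val actor = new MiniActor
    val buggyDownload = Promise[String]()
    val fixedDownload = Promise[String]()
    val buggy = actor.handleBuggy("A", buggyDownload.future)
    val fixed = actor.handleFixed("A", fixedDownload.future)
    actor.currentSender = "B" // a message from B is processed in the meantime
    buggyDownload.success("image")
    fixedDownload.success("image")
    (Await.result(buggy, 5.seconds), Await.result(fixed, 5.seconds))
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In this model, run() yields B for the buggy handler and A for the fixed one, which matches the explanation above: the value read inside the callback belongs to whichever message is current when the future completes.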
The bug in Figure 9, with its incorrect mixing of actors and futures, is a good example of the misuse of mixed concurrency models [96]. These mixtures are often necessary for the implementation of real-world actor software [96, 98]; however, their misuse can cause bugs. Tasharofi et al. [98] study 15 large and mature Scala Akka software systems and observe that “80% of them mix actor model with another concurrency model [such as multithreaded concurrency]” where “mixtures of Actor[s] and Future[s] are common.”

The race in Figure 9 is a simple bug with a well-known anti-pattern of closing over the mutable sender() in the callback of a future. This anti-pattern is well-documented in several places, such as Akka documentation [71], books [59, 65], and blogs [80]. However, both Stack Overflow and GitHub developers still write buggy actor code with this race as the root cause. In fact, 29.6% of our Race bugs close over sender() in the callback of a future, a scheduled message, or a task. The anti-pattern behind these bugs is syntactic and can be found using a simple static analysis.

Races between messages that are sent concurrently to the same actor [28, 97], between asynchronous life cycle actions of the actor, such as creation, initialization, lookup, restart, and termination, and between messages and life cycle actions are more examples of our Race root cause. There are actor frameworks and languages that are less susceptible to our Race bugs. For example, Orleans [82] supports more implicit life cycle management and is less susceptible to races between life cycle actions and messages. Similarly, Erlang [24] prevents sharing among its actors and is less susceptible to races between accesses to shared data.

Our Race root cause overlaps with Hedden and Zhao’s [51] Message Order, Operation Order, and Cooperation root causes.
Their Message Order bugs occur when a “developer expects messages to arrive in a certain order instead of planning around the non-deterministic nature of actor messaging.” Similarly, the reason for their Operation Order bugs is that “much like message ordering, there are sometimes issues that occur as a result of the [incorrect]
order of a job being carried out on a system.” Finally, their Cooperation bugs occur because of the “problems occurring from actors performing or existing simultaneously [concurrently].”

In addition, our Race overlaps with Torres Lopez et al.’s [77] Bad Message Interleaving and Memory Inconsistency root causes. They study 23 actor bugs in various actor models from 11 previous works and classify their root causes into 2 categories and 6 subcategories, with subcategories inside parentheses: Lack of Progress (Communication Deadlock, Behavioral Deadlock, and Livelock) and Message Protocol Violation (Message Order Violation, Bad Message Interleaving, and Memory Inconsistency). Their Bad Message Interleaving bugs occur when “a message is processed between two messages which are expected to be processed one after the other.” Similarly, their Memory Inconsistency bugs occur when “different actors have inconsistent views of shared resources.”

5.3 Root Cause 3: API Confusion

API Confusion is the third most common root cause (14.0%) for our actor bugs. The Akka API provides a large number of public classes: 1,438 classes with 41,554 methods [21]. In comparison, the general-purpose Scala provides only 491 classes with 38,334 methods [90]. Akka API classes support a broad spectrum of actor functionalities ranging from untyped to typed actors, remoting to clustering, and supervision to routing. The syntax, semantics, and usage constraints for these classes could differ significantly, even for similar functionalities. For example, unlike any other supervisor actor that by default restarts its children on failure, BackoffSupervisor stops and starts its children. In addition, these differences can exist for similar functionalities in different versions of Akka. Akka developers are responsible for understanding and satisfying the syntax, semantics, and usage constraints of Akka API classes.
Otherwise, confusions can cause bugs.

    bug:
    class SenderActor() extends Actor {
      override def preRestart(..): Unit = { .. /* send a message to supervisor*/}
      override def receive: Receive = {
        case cmd: .. => throw new MsgExc(..) } .. }
    class Main extends App {
      val childProps = Props(new SenderActor())
      val supProps =
        BackoffSupervisor.props(Backoff.onFailure(childProps,..)
          .withSupervisorStrategy(OneForOneStrategy(..) {
            case m: MsgExc => { SupervisorStrategy.Restart}..}))
      val sup = context.actorOf(supProps)
      sup ! cmd }
    fix:
        BackoffSupervisor.props(Backoff.onStop(childProps,..)

Fig. 10. Actor bug [15] with an API root cause.

Figure 10 shows a buggy actor program [15] with an API Confusion root cause. Here, the developer intends for the supervisor actor BackoffSupervisor to restart its child actor SenderActor when it throws an exception and fails. SenderActor overrides its preRestart() method to send a failure message to its supervisor. Afterwards, it receives a cmd message and, to process it, throws a MsgExc exception. Throwing this exception should cause the parent BackoffSupervisor to restart its child SenderActor. The Main application creates the configuration objects childProps and supProps. A configuration object specifies options that are used in the creation of an actor. childProps configures a child actor, of type SenderActor, and supProps configures a BackoffSupervisor with a supervision strategy OneForOneStrategy for exception handling. Main creates the supervisor actor sup with the configuration supProps and sends a cmd message to it. sup forwards this message to its child, which fails during its processing. In Akka, a BackoffSupervisor restarts its child whenever it throws an exception, but each time with an increasing delay. OneForOneStrategy of a parent applies only to the child that fails and leaves other children intact.
The preRestart() method is invoked by Akka automatically before an actor restarts.

However, this program is buggy with an Unexpected Behavior symptom and an API Confusion root cause. The impact of this symptom is that the failed child SenderActor does
not restart and its preRestart() is not invoked. The developer explains this symptom as “I attempted to resend a message to the [BackoffSupervisor] actor on preRestart() hook [of SenderActor], but somehow the hook is not being triggered.” The root cause of this bug is that the developer does not know that BackoffSupervisor and its Backoff.onFailure method are significantly different from other supervisors in Akka. Backoff.onFailure does not restart SenderActor and instead stops and starts it and thus its preRestart() is never invoked. The fix for this bug, as shown in Figure 10, suggests using Backoff.onStop to restart SenderActor, instead of Backoff.onFailure.

This bug, similar to the bug in Figure 9, is well-documented in Akka API documentation [70]. The documentation says “note that this supervision strategy [onFailure()] does not restart the actor but rather stops and starts it. The preRestart() hook will not be executed if the supervised actor fails or stops.” However, we observe that the developer still writes this buggy actor code with this API confusion as the root cause.

Our API Confusion root cause is new and cannot be found in Hedden and Zhao’s [51] or Torres Lopez et al.’s [77] root causes for actor bugs.

5.4 Root Cause 4: Explicit Life cycle

Explicit Life cycle is the fourth most common root cause (12.4%) for our actor bugs. In Akka, an actor goes through different life cycle phases, such as creation, initialization, lookup, monitoring, termination, and restart. The life cycle of the actor should be managed explicitly and manually. In addition, the explicit life cycle management is combined with the implicit life cycle management. Unlike explicit life cycle management that is implemented by and is visible to the developer, the implicit and automatic life cycle management is implemented by Akka and is invisible to the developer.
For example, Akka invokes life cycle methods preStop and postStop of an actor implicitly right before and after it shuts down the actor. Finally, explicit and implicit life cycle managements are combined with features such as parental supervision and actor systems. For example, a parent actor terminates only after all of its children terminate. Akka developers are responsible for understanding and correctly managing these life cycles explicitly and manually. Otherwise, mismanagement can cause bugs.

Figure 11 shows a buggy actor program [7] with an Explicit Life cycle root cause. Here, the developer intends to perform some cleanup in the child actor Child before the termination of its parent actor Parent stops and terminates the child. Parent creates its child c. Afterwards, it overrides its postStop() method to send the message "doCleanup" to c and blocks and waits for its result future. Parent continues by stopping and terminating c and then terminating itself. In Akka, the ? operator sends a request-reply message. The result() method blocks on a future for a timeout period and returns the value of the future if it completes during this period. Otherwise, it times out and throws an exception. The stop() method terminates an actor. The self variable refers to the current actor instance.

    bug
    class Parent extends Actor { ..
      val c = context.actorOf(Props[Child])
      override def postStop {
        log.info("Shutdown .. Sending message to child..")
        val future = c ? "doCleanup"
        log.info("Waiting for child to complete task..")
        Await.result(future, Duration.Inf)
        context.stop(c)
        log.info("Child stopped. Stopping self. Bye!")
        context.stop(self)}}
    class Child extends Actor { .. }

    fix
    class Child extends Actor { ..
      override def postStop {
        //move child cleanup from ParentActor to here
      }

    Fig. 11. Actor bug [7] with a Life cycle root cause.
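The blocking semantics of the result() method described above can be illustrated without Akka. The following is a minimal sketch that uses only standard Scala futures; the object AwaitSketch and its methods are hypothetical names introduced for illustration and are not part of the studied program.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Try}

object AwaitSketch {
  // The future completes within the timeout, so result() returns its value.
  def completesInTime(): Int =
    Await.result(Future(42), 5.seconds)

  // The promise is never completed, so result() gives up after the timeout
  // and throws a TimeoutException instead of returning a value.
  def timesOut(): Boolean =
    Try(Await.result(Promise[Int]().future, 100.millis)) match {
      case Failure(_: TimeoutException) => true
      case _                            => false
    }
}
```

With an unbounded timeout such as Duration.Inf, the second call would not time out and would block for as long as the future stays incomplete.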
However, this program is buggy with an Error symptom and an Explicit Life cycle root cause. The impact of this symptom is that the child c does not do any cleanup. The developer explains this symptom as, “[ERROR] Recipient Actor[akka:..Parent/Child#70868360] had already been terminated.” The root cause of this bug is that the developer misunderstands the explicit and implicit life cycle managements for Parent and c and their combination with the parental supervision of c by Parent. A parent actor stops only after all its children terminate and not before. Therefore, when Akka invokes postStop() of Parent, the child c is already stopped and cannot receive and process "doCleanup". The fix for this bug suggests overriding postStop() of Child and performing the cleanup of the child there, instead of in postStop() of Parent. In addition to this bug, there is another bug in this program that causes Parent to block forever. This is because Parent sends a request-reply message to c and waits for its reply for the infinite duration Duration.Inf. However, c is already stopped and never sends a reply back. Finally, the program in Figure 5 is another example of an actor bug with an Explicit Life cycle root cause. Here, the developer explicitly terminates the actor myActor but forgets to explicitly terminate its enclosing actor system actorSystem.

Missing the explicit starting of remote actors before messaging them, explicit creation of actors before looking them up, and explicit termination of actors and actor systems before termination of their enclosing applications are more examples of our Explicit Life cycle root cause. There are actor frameworks and languages that are less susceptible to our Explicit Life cycle bugs. For example, Orleans [82] supports more implicit and automatic actor life cycle management and is less susceptible to bugs that are due to explicit and manual mismanagement of the life cycle of actors.

Our Explicit Life cycle root cause overlaps with Hedden and Zhao’s [51] Creation, Shutdown, and Recovery root causes.
Their Creation bugs “occur as a result of this [incorrect] actor creation . . . [that] can lead to errors if improperly implemented.” Their Shutdown bugs are bugs “involving a problematic shutdown process [since] every system must shut down at some point and should do so gracefully.” Their Recovery bugs occur due to “unexpected variables or events involved during this [actor] recreation process [that] if not anticipated would lead to errors [because] the actor model is built to allow actors that fail to recover . . . by having its master [supervisor] recreate it.”

5.5 Root Cause 5: Programming

Programming is the fifth most common root cause (12.4%) for our actor bugs. A program should satisfy the syntactic and semantic requirements of the programming language that is used to write it. Otherwise, violations could cause bugs.

Incorrect dependencies and imports, recreation of singleton objects, matching against erased type variables, and confusing classes with identical unqualified names that belong to different packages are examples of our Programming root cause.

Most (60.9%) of our Programming bugs are compile-time bugs and can be detected by compilers statically, whereas the rest (39.1%) are run-time bugs.

Our Programming root cause overlaps with Hedden and Zhao’s [51] Logic root cause.

5.6 Root Cause 6: Messaging Patterns

Messaging Patterns is the sixth most common root cause (9.7%) for our actor bugs. In Akka, an actor can use several messaging patterns for sending and receiving messages. These patterns, such as request-response, forward, and route, can have complex semantics. For example, for a request-response message Akka creates an additional actor, which is invisible to the developer, to process the future reply of the message. In addition, the complex semantics of these patterns can be combined with the semantics of other features of Akka, such as parental supervision and exception handling. Akka developers are responsible for understanding these complex semantics to use these patterns correctly. Otherwise, misuses can cause bugs.

Figure 12 shows a buggy actor program [13] with a Messaging Patterns root cause. Here, the developer intends for the parent actor Deletion to send a delete message to its child ES and either receive a reply for the successful deletion or restart the child if it throws an exception and fails. Deletion overrides its supervision strategy to restart its child when it throws an exception and creates the child actor es, of type ES. Afterwards, it sends a request-reply message DeleteFromES to es and blocks on its future reply f for the duration of timeout. ES overrides its preRestart() and postRestart() methods. Afterwards, it receives the message DeleteFromES and to process it throws an exception, of type Exception. Throwing this exception should cause the parent Deletion to restart the child ES. In Akka, the ask method sends a request-reply message.

    bug
    class Deletion extends Actor { ..
      override val supervisorStrategy =
        OneForOneStrategy(..){case _:Exception => Restart}
      val es = context.actorOf(Props[ES]..) ..
      val f = ask(es, DeleteFromES(..))
      val isDel = Await.result(f, timeout.duration) ..}}
    class ES extends Actor { ..
      override def preRestart() { .. } ..
      override def postRestart() { .. } ..
      def receive = {
        case DeleteFromES(..) =>
          throw new Exception("..") ..} }

    fix
    f.onComplete { case Success(..) => ..
                   case Failure(..) => .. }

    Fig. 12. Actor bug [13] with a Patterns root cause.

However, this program is buggy with an Unexpected Behavior symptom and a Messaging Patterns root cause. The impact of this symptom is that the child es does not restart and its preRestart() and postRestart() methods are not invoked.
The developer explains this symptom as, “postRestart() and preRestart() methods [of es] are not getting invoked.” The root cause of this bug is that the developer misunderstands the semantics of the request-response pattern and its combinations with the semantics of parental supervision and exception handling. The parent Deletion sends a request-response message and blocks and waits for some time for the reply from the child es. However, es does not send the reply and instead throws an exception, which is stored in f, and fails. es may throw its exception before or after Deletion’s wait is over. In the former case, Deletion is still blocked and cannot restart es. In the latter case, Deletion throws a timeout exception, fails itself, and cannot restart es. The fix for this bug, as shown in Figure 12, suggests using the callback onComplete() on f to identify whether the result of the request-reply message is a success or a failure.

Our Messaging Patterns root cause overlaps with Hedden and Zhao’s [51] Response root cause. Their Response bugs occur due to “improper responses to different communication-based operations [messages].”

5.7 Root Cause 7: Model Confusion

Model Confusion is the seventh most common root cause (7.6%) for our actor bugs. The computation model of Akka actor concurrency is significantly different from multithreaded concurrency, as discussed in Section 3. In addition, well-known and basic functionalities may have significantly different semantics in these models. For example, parental supervision and exception handling for Akka actors are significantly different from standard exception handling for Scala threads. Akka developers are responsible for clearly understanding Akka and its underlying computational model. Otherwise, confusion can cause bugs.

Figure 13 shows a buggy actor program [2] with a Model Confusion root cause.
Here, the developer intends to evaluate the performance of a LoadWorker router. The LoadGenerator actor creates the router actor loadRouter, of type LoadWorker, with a round robin routing strategy. Afterwards, it receives a "start" message and to process it generates an infinite number of random messages and sends them to loadRouter. A RoundRobinRouter forwards its messages to its routees one after another.

    bug
    class LoadGenerator(..) extends Actor { ..
      val loadRouter = context.actorOf(Props[LoadWorker]
        .withRouter(RoundRobinRouter(..)..))
      def receive = {
        case "start" => ..
          while(true)
            loadRouter ! r.getRandomCommand() } } ..}
    class LoadWorker extends Actor { .. }

    fix
    (1 to batchSize) foreach { _ => loadRouter ! r.getRandomCommand() }
    self ! "start"

    Fig. 13. Actor bug [2] with a Model root cause.

However, this program is buggy with an Incorrect Termination symptom and a Model Confusion root cause. The impact of this symptom is that the LoadGenerator actor cannot evaluate the performance of the LoadWorker router and instead hangs. The developer explains this symptom as, “actor app hangs under high volume.” The root cause of this bug is that the developer is confused about the basics of the Akka actor model and its message processing semantics. To process "start", LoadGenerator sends messages to loadRouter in an infinite loop. Therefore, to process its first "start" and the messages it receives afterwards, LoadGenerator should be able to process its messages concurrently and more than one message at a time. However, in Akka, an actor processes its messages sequentially, one after another. Therefore, LoadGenerator falls into an infinite loop when processing its first "start", never finishes its processing, and never starts the processing of its next messages. The fix for this bug, as shown in Figure 13, suggests sending the messages in batches of size batchSize and using self-messaging with "start" to trigger the next batch, instead of an infinite loop. In addition, the program in Figure 3 is another example of a bug with a Model Confusion root cause.
Here, the developer is similarly confused about the sequential processing of messages in the FileWriter actor.

Sending a message from a non-actor entity to an actor, invoking a method of an actor instead of sending it a message, and sending a message to an actor Actor instead of its reference ActorRef are more examples of our Model Confusion bugs. Akka separates an actor, of type Actor, and its reference, of type ActorRef, and makes the actor accessible only through its reference so that the state of the actor is accessed only by its messages. Note that this separation still cannot guarantee that the state of the actor is not shared with other actors.

There are actor models that are less susceptible to our Model Confusion bugs. For example, academic actor models that support Parallel Actor Monitors [92] and transactional message processing [50] allow for concurrent message processing and are less susceptible to bugs related to sequential message processing. However, industrial-strength actor frameworks and languages, such as Akka, Orleans [82], and Erlang [24], process their messages sequentially.

Our Model Confusion root cause is new and cannot be found in Hedden and Zhao’s [51] or Torres Lopez et al.’s [77] root causes for actor bugs.

5.8 Root Cause 8: Misnaming

Misnaming is the eighth most common root cause (5.4%) for our actor bugs. An actor name includes several properties of the actor, such as its proper name, enclosing actor system, user or system actor, supervision hierarchy, network protocol, and port number. Akka actor developers are responsible for manually providing and maintaining correct actor names. Otherwise, misnamings can cause bugs.

Figure 14 shows a buggy actor program [5] with a Misnaming root cause. Here, the developer intends to configure the server application Server and its server actor db for the remote communication with the client application Client. Server configures and creates the actor system system with the name IMDB and the configuration config. IMDB and its actors are configured to run on a machine with the IP address 127.0.0.1 and at the port 2253. IMDB uses the TCP protocol from Netty’s client-server framework [86] for remote messaging. Server creates the server actor db with the name ImdbActor to reside in the IMDB actor system. Client looks up dbs using the name AkkaIMDB@127.0.0.1:2552/user/ImdbActor and sends it a message. In this name, ImdbActor is the name of the server actor that resides in the AkkaIMDB actor system running on a machine with the IP address 127.0.0.1 and at the port 2552. ImdbActor is a user actor. In Akka, the actorSelection method looks up an actor with a specific name. The parseString method of ConfigFactory parses a configuration string.

    bug
    class DB extends Actor { .. }
    object Server extends App {
      val config = ConfigFactory.parseString("..remote { ..
        netty.tcp{ hostname="127.0.0.1" port=2253 }..}..")
      val data = ConfigFactory.load(config)
      val system = ActorSystem("IMDB", data)
      val db = system.actorOf(Props(new DB),"ImdbActor") }
    object Client extends App {
      val dbs = system.actorSelection
        (s"AkkaIMDB@127.0.0.1:2552/user/ImdbActor")
      (dbs ? messages.Set(key , value)).map(..) .. }

    fix
    val dbs = system.actorSelection
      (s"AkkaIMDB@127.0.0.1:2553/user/ImdbActor")

    Fig. 14. Actor bug [5] with Misnaming root cause.

However, this program is buggy with an Incorrect Messaging symptom and a Misnaming root cause. The impact of this symptom is that the Client application cannot communicate with the remote Server and its actor db. The developer explains this symptom as, “when I put the database actor [db] on a remote actor system . . . , I have the error deadletters encountered.” The root cause of this bug is that the developer uses the incorrect name for db with the wrong port number 2552, instead of 2253. db is not running at the port 2552 and thus cannot be looked up, and the messages sent to it are delivered to /deadLetters. The fix for this bug, as shown in Figure 14, suggests using the name AkkaIMDB@127.0.0.1:2553/user/ImdbActor for dbs, instead of AkkaIMDB@127.0.0.1:2552/user/ImdbActor. In addition, the program in Figure 2 is another example of a bug with a Misnaming root cause. Here, the developer attempts to create multiple Worker actors with the same non-unique name "worker", which is not allowed.

Actor names with incorrect parental hierarchies, incorrect letter cases for characters, and incorrect network protocols are more examples of our Misnaming root cause.

Our Misnaming root cause is new and cannot be found in Hedden and Zhao’s [51] or Torres Lopez et al.’s [77] root causes for actor bugs.

5.9 Root Cause 9: Misconfiguration

Misconfiguration is the ninth most common root cause (3.8%) for our actor bugs. The default Akka configuration [22] provides a large number of parameters (255) to configure actors and their dispatching, supervision, shutdown, routing, mailbox, clustering, remoting, logging, serialization, and deployment. About 42.4% of these parameters need values of complex data types, such as API class names, values of other configuration parameters, and arrays of these values. In addition to the default configuration, developers can declare and use their own custom configuration parameters. There are consistency constraints on the values of these parameters. For example, a PinnedDispatcher cannot be configured to use an executor that is not a thread-pool-executor [22]. Akka developers are responsible for manually discovering and satisfying these constraints. Otherwise, misconfigurations can cause bugs.

Figure 15 shows a buggy actor configuration [4] with a Misconfiguration root cause.
Here, the developer intends to configure the custom dispatcher blocking-dispatcher and create an actor that uses this dispatcher, in the Main application. In Akka, all actors share the same default dispatcher; however, a developer can configure and use their own dispatcher. The configuration file application.conf declares blocking-dispatcher to be a dispatcher of type PinnedDispatcher with an executor of type thread-pool-executor. The executor maintains a thread pool with a minimum of 2*2 threads (core-pool-size-min * core-pool-size-factor) and a maximum of 10*2 (core-pool-size-max * core-pool-size-factor).

    bug
    /* part of application.conf */
    blocking-dispatcher {
      type = PinnedDispatcher
      executor = "thread-pool-executor"
      thread-pool-executor {
        core-pool-size-min = 2
        core-pool-size-factor = 2.0
        core-pool-size-max = 10 } .. }
    object Main extends App { ..
      ..actorOf(..withDispatcher("blocking-dispatcher")..)..}

    fix
    core-pool-size-min = 1
    core-pool-size-factor = 1
    core-pool-size-max = 1

    Fig. 15. Actor bug [4] with a Misconfig root cause.

However, this configuration is buggy with an Error symptom and a Misconfiguration root cause. The impact of this symptom is that an actor with the configuration blocking-dispatcher cannot be created. The developer explains this symptom as, “when I create [an] actor [with the configuration blocking-dispatcher] I’m getting the exception: Exception in thread ’main’ akka.ConfigurationException: Dispatcher [blocking-dispatcher].” A ConfigurationException is thrown when there is a problem with a configuration. The root cause of this bug is that the developer is not satisfying the consistency constraints for the configuration of PinnedDispatcher. PinnedDispatcher assigns a single thread to each actor and thus each actor has its own thread pool with only one thread in it. Therefore, core-pool-size-min, core-pool-size-max, and core-pool-size-factor should be 1, and not 2, 10, and 2.0. The fix for this bug, as shown in Figure 15, suggests setting these values to 1.
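The consistency constraints discussed for PinnedDispatcher can be written down as an explicit check. The following is a minimal sketch in plain Scala; DispatcherConfig and PinnedDispatcherCheck are hypothetical names introduced for illustration. The sketch does not reproduce how Akka validates its configuration; it only encodes the two constraints as this section states them.

```scala
// Hypothetical model of the dispatcher settings shown in Figure 15.
final case class DispatcherConfig(
    dispatcherType: String,
    executor: String,
    corePoolSizeMin: Int,
    corePoolSizeFactor: Double,
    corePoolSizeMax: Int
)

object PinnedDispatcherCheck {
  // Returns the constraints, as stated in the text, that a configuration violates.
  def violations(c: DispatcherConfig): List[String] =
    if (c.dispatcherType != "PinnedDispatcher") Nil
    else {
      val executorIssue =
        if (c.executor != "thread-pool-executor")
          List("PinnedDispatcher requires a thread-pool-executor")
        else Nil
      val sizeIssue =
        if (c.corePoolSizeMin != 1 || c.corePoolSizeMax != 1 || c.corePoolSizeFactor != 1.0)
          List("core-pool-size-min, -factor, and -max should all be 1")
        else Nil
      executorIssue ++ sizeIssue
    }
}
```

Under these assumptions, the buggy values (2, 2.0, 10) yield one violation, for the pool sizes, and the fixed values (1, 1, 1) yield none.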
The configuration in Figure 15 is only a few lines long; however, it includes 2 consistency constraints that could be violated by the developer, doubling the chances of a Misconfiguration bug. The first is the constraint on the numbers of threads in PinnedDispatcher, which is already violated. The second is the constraint that a PinnedDispatcher must be used with a thread-pool-executor, which is not violated.

Incorrect configurations of actor systems, actor mailboxes, and remoting are a few more examples of our Misconfiguration root cause.

Our Misconfiguration root cause is new and cannot be found in Hedden and Zhao’s [51] or Torres Lopez et al.’s [77] root causes for actor bugs.

5.10 Root Cause 10: Untyped Communication

Untyped Communication is the least common (1.1%) root cause for our actor bugs. Correct actor communication requires that a receiver actor knows about the messages that its senders can send. Classic Akka is untyped and its actors do not know about the type of messages that they may receive. There is nothing that prevents the sender from sending a message that its receiver cannot receive and process [40]. Akka developers are responsible for manually discovering these message types and guaranteeing that the receiver can receive all the messages that its senders can send to it. Otherwise, miscommunications can cause bugs.

Figure 16 shows a buggy actor program [16] with an Untyped Communication root cause. Here, the developer intends to receive and process all the messages that can be sent to the HttpHostConnector actor. HttpHostConnector receives messages of types HttpRequest, Disconnected, and DemandIdleShutdown to accept a connection from a client, disconnect the client, and shut down the service.

However, this program is buggy with an Error symptom and an Untyped Communication root cause.
The impact of this symptom is that HttpHostConnector does not receive and process all the messages that can be sent to it. The developer explains this symptom as, “DeathPactException in HttpHostConnector.” An actor throws a DeathPactException when it receives a Terminated message but cannot process it. A message Terminated is sent to a watcher actor by its watched actor when the watched actor terminates. HttpHostConnector does not receive and process messages of type Terminated that can be sent to it. The fix for this bug, as shown in Figure 16, suggests receiving and processing Terminated.

    bug
    class HttpHostConnector(..) extends Actor .. { ..
      def receive: Receive = {
        case request: HttpRequest => ..
        case Disconnected(..) => ..
        case DemandIdleShutdown => .. } }

    fix
    case Terminated(..) => ..

    Fig. 16. Actor bug [16] with an Untyped root cause.

There are actor languages and frameworks that are less susceptible to our Untyped Communication bugs. For example, Akka Typed [23] is a recent variation of Classic Akka that requires typed communications and can statically guarantee the absence of our Untyped Communication bugs. In Classic Akka, the bug in Figure 16 is detected at run time; however, in Akka Typed, this bug can be detected at compile time.

Our Untyped Communication root cause is new and cannot be found in Hedden and Zhao’s [51] or Torres Lopez et al.’s [77] root causes for actor bugs.

5.11 Implications

Using our new root causes to extend the set of bugs to test for and find. Tasharofi et al. [97] identify the bug pattern Flexible Interfaces [77] and use it to generate test schedules for actor software. In this pattern, an actor can dynamically change the set of messages it receives and processes. The root cause of a bug with this pattern is that a new behavior cannot process a message that the previous behaviors of the actor could process. Some of our root causes overlap with Hedden and Zhao’s [51] and Torres Lopez et al.’s [77] root causes for actor bugs.
However, our API Confusion, Model Confusion, Misnaming, Misconfiguration, and Untyped Communication root causes, which account for 31.9% of our actor bugs, are new. Both previous and future actor testing and bug finding tools and techniques can discover the patterns of our new root causes and use these patterns to extend the set of bugs that they test for to increase the number of bugs they find. These patterns could include both syntactic and semantic patterns. For example, 29.6% of our Race bugs follow a syntactic pattern in which an entity, such as a future or a scheduled task, that can run outside of and concurrently with an actor closes over the mutable value of the sender() of the actor. The Flexible Interfaces pattern is an example of a semantic pattern. Previous work, such as FindBugs [52] for non-actor Java bugs, shows that finding bugs does not “require sophisticated or extensive forms of analysis” and “many errors can be found with trivial static examination [of the code].”

Supporting implicit life cycle management and typed communication to prevent bugs. The Orleans actor framework [82] supports more implicit and automatic life cycle management and is less susceptible to our Explicit Life cycle bugs. Similarly, Akka Typed [23] supports typed communications and is less susceptible to our Untyped Communication bugs. New variations of Akka can prevent our Explicit Life cycle and Untyped Communication bugs, which are 13.5% of our actor bugs, by supporting more implicit life cycle management and typed communications.

Using the commonality of our bugs to make tradeoffs between different features. Designers of new variations of Akka can use the commonality of our actor bugs, as one of many factors that they may consider, to make tradeoffs between features that these new variations may support.
For example, the designer can make a tradeoff and choose to support implicit life cycle management, which can prevent as much as 12.4% of our actor bugs, over typed communications, which can prevent only as much as 1.1%.
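The syntactic Race pattern mentioned in this section, in which a future closes over the mutable sender() of an actor, can be sketched without Akka. In the following minimal example, MiniActor and SenderClosureSketch are hypothetical stand-ins introduced for illustration: the field sender plays the role of Akka's sender(), and a promise is used to delay the asynchronous work until a second message has arrived.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical stand-in for an actor whose sender changes with every message.
final class MiniActor {
  @volatile private var sender: String = "nobody"

  // Buggy: the closure reads `sender` only when the future's callback runs,
  // which may be after a later message has overwritten it.
  def handleBuggy(from: String, work: Future[Unit]): Future[String] = {
    sender = from
    work.map(_ => s"reply to $sender")
  }

  // Fixed: copy the current sender into an immutable local value first.
  def handleFixed(from: String, work: Future[Unit]): Future[String] = {
    sender = from
    val replyTo = sender
    work.map(_ => s"reply to $replyTo")
  }
}

object SenderClosureSketch {
  // Handles a message from "alice", then one from "bob", and only then lets
  // the asynchronous work finish; returns the reply computed for alice.
  def replyForAlice(fixed: Boolean): String = {
    val actor = new MiniActor
    val work = Promise[Unit]()
    def handle(from: String): Future[String] =
      if (fixed) actor.handleFixed(from, work.future)
      else actor.handleBuggy(from, work.future)
    val first = handle("alice")
    handle("bob")
    work.success(())
    Await.result(first, 5.seconds)
  }
}
```

In this sketch, the buggy variant addresses alice's reply to bob, whereas the fixed variant addresses it to alice. A bug finding tool could plausibly look for the buggy shape, a mutable sender read inside a closure passed to a future, as a purely syntactic pattern.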

6 API USAGES

In this section, we answer RQ3 and discuss the usages of Akka API by our actor bugs.

Figure 17 shows the usages of Akka API packages, in a heatmap, where the darker the gray is, the more the usage is. At the bottom, the figure lists the usage numbers and percentages for used packages and their classes. The usage for packages and classes that are not listed is zero. For space reasons, in the heatmap, we shorten the name of a package to its second part. For example, actor is short for akka.actor.

    akka.actor(492)[81.2%]: ActorRef(188) ActorContext(119) ActorSystem(115) Props(19) Actor(14) AbstractActor(13) Stash(8) Scheduler(6) AbstractFSM(4) ActorRefFactory(2) TypedActor(2) ChildRestartStats(1) TypedActorFactory(1)
    akka.pattern(52)[9.1%]: AskableActorRef(40) BackoffSupervisor(6) Backoff(3) PipeableFuture(3)
    akka.testkit(14)[2.5%]: TestActorRef(4) TestKit(4) TestProbe(4) TestFSMRef(2)
    akka.event(6)[1.1%]: EventStream(6)
    akka.serialization(3)[0.5%]: Serialization(3)
    akka.cluster(2)[0.4%]: Cluster(2)
    akka.io(1)[0.2%]: Tcp(1)
    akka.routing(1)[0.2%]: Routing(1)

    Fig. 17. Akka API usages of actor bugs.

According to Figure 17, actor bugs use the following 8 Akka API packages, with decreasing commonalities: akka.actor, akka.pattern, akka.testkit, akka.event, akka.serialization, akka.cluster, akka.io, and akka.routing. Among these packages, akka.actor is the most common package (81.2%) and akka.routing is the least common (0.2%). The usages of packages by actor bugs are different and this difference can be significant for some API packages. For example, akka.actor is used 406 times more than akka.routing. In addition, the 3 most common packages akka.actor, akka.pattern, and akka.testkit, which are only 2.7% of Akka API packages, are responsible for 97.7% of usages.
These packages provide the basics to configure, create, look up, reference, schedule, send and receive messages, supervise, shut down, and test actors. Packages for untyped actors, such as akka.actor (81.2%) and akka.cluster (0.4%), are used more than packages for typed actors, such as akka.actor.typed (0.0%) and akka.cluster.typed (0.0%). Similarly, packages for local actors, such as akka.actor (81.2%) and akka.routing (0.2%), are used more than packages for remote actors, such as akka.serialization (0.5%) and akka.remote.routing (0.0%). The testing package akka.testkit is the third most common package (2.5%).

Our observation about the uncommonality of the Akka remote API agrees with Tasharofi et al.’s [98] observation that “most developers use actors to address the local scalability problem, that is, they use actors as a solution for local concurrent programming [instead of remote programming].” Tasharofi et al. study fifteen large and mature Scala Akka actor software systems.

6.1 Implications

Targeting fewer, untyped, and local Akka API packages to scale bug pattern mining to large APIs in large-scale code bases. Previous work [68, 69] mines non-actor software code bases to discover their API usage patterns and uses these patterns to find bugs that violate them. For example, for the class ReentrantLock in Java, the invocation of method unlock() after its lock() is a pattern that, if violated, can cause multithreaded bugs. One challenge in these works is to scale the pattern mining to large APIs in large-scale code bases. Similar tools and techniques can be developed for Akka software; however, they should address a similar challenge. Designers of these tools and techniques can use our observations about usages of Akka API to address this challenge.
To scale to the large Akka API in large-scale code bases of Akka software, the designer can focus on only the 2.7% of Akka packages that are responsible for 97.7% of Akka API usages by our actor bugs, instead of all packages. Similarly, they can focus on the API for untyped actors, instead of typed actors, and local actors, instead of remote actors.

Proc. ACM Program. Lang., Vol. 1, No. OOPSLA, Article 1. Publication date: November 2020.

7 DIFFERENCES

In this section, we answer RQ4 and discuss the differences between our Stack Overflow and GitHub actor bugs in the commonalities and distributions of their characteristics.

7.1 Symptoms

Figure 18 shows the symptoms of our actor bugs in Stack Overflow and GitHub. According to this figure, the common symptoms of actor bugs in Stack Overflow and GitHub are different. For our Stack Overflow actor bugs, Error is the most common symptom (40.0%) and Incorrect Exceptions is the least common (3.1%). In contrast, for our GitHub actor bugs, Unexpected Behavior is the most common symptom (53.6%) and Incorrect Exceptions is the least common (0.0%). In addition, the commonalities of some symptoms are significantly different between our Stack Overflow and GitHub actor bugs. Incorrect Messaging and Error are 3.6 and 1.5 times more common in Stack Overflow, and there is no Incorrect Exceptions in GitHub. In contrast, Unexpected Behavior and Incorrect Termination are 3.1 and 1.6 times more common in GitHub. Finally, although there is a significant difference between the commonalities of individual symptoms in our Stack Overflow and GitHub actor bugs, there is no statistically significant difference between their distributions.

[Figure 18: bar chart; y-axis: Percentages; x-axis: Symptoms (Error, Behavior, Messaging, Termination, Exception); series: Stack Overflow, GitHub.]
Fig. 18. Symptoms in Stack Overflow and GitHub.

7.2 Root Causes

Figure 19 shows the root causes of actor bugs in Stack Overflow and GitHub. According to this figure, the common root causes of actor bugs in Stack Overflow and GitHub are different. For our Stack Overflow actor bugs, API Confusion is the most common root cause (18.5%) and Untyped Communication is the least common (0.0%). In contrast, for our GitHub actor bugs, Logic is the most common root cause (57.2%) and Programming and Model Confusion are the least common (0.0%). In addition, the commonalities of some root causes are significantly different between Stack Overflow and GitHub actor bugs. Logic and Explicit Life cycle are 18.5 and 1.3 times more common in GitHub. In contrast, API Confusion, Misnaming, Misconfiguration, Messaging Patterns, and Race are 5.2, 3.9, 2.7, 2.2, and 1.3 times more common in Stack Overflow, and there is no Programming and Model Confusion in GitHub. Finally, in addition to the significant differences between the commonalities of individual root causes in Stack Overflow and GitHub actor bugs, there is a statistically significant difference between their distributions.

[Figure 19: bar chart; y-axis: Percentages; x-axis: Root causes (Logic, API, Race, Life, Patterns, Prog, Model, Naming, Config, Untyped); series: Stack Overflow, GitHub.]
Fig. 19. Root causes in Stack Overflow and GitHub.

Stack Overflow (left):
akka.actor(396)[87.0%]: ActorRef(139) ActorSystem(108) ActorContext(89) Props(19) AbstractActor(12) Actor(11) Stash(6) AbstractFSM(4) Scheduler(3) ActorRefFactory(2) TypedActor(2) TypedActorFactory(1)
akka.pattern(46)[10.1%]: AskableActorRef(34) BackoffSupervisor(6) Backoff(3) PipeableFuture(3)
akka.testkit(7)[1.5%]: TestActorRef(3) TestKit(2) TestProbe(2)
akka.event(4)[0.9%]: EventStream(4)
akka.serialization(1)[0.2%]: Serialization(1)
akka.routing(1)[0.2%]: Router(1)

GitHub (right):
akka.actor(96)[82.8%]: ActorRef(49) ActorContext(30) ActorSystem(7) Actor(3) Scheduler(3) Stash(2) AbstractActor(1) ChildRestartStats(1)
akka.testkit(7)[6.0%]: TestFSMRef(2) TestKit(2) TestProbe(2) TestActorRef(1)
akka.pattern(6)[5.2%]: AskableActorRef(6)
akka.serialization(2)[1.8%]: Serialization(2)
akka.cluster(2)[1.8%]: Cluster(2)
akka.event(2)[1.8%]: EventStream(2)
akka.io(1)[0.9%]: Tcp(1)

Fig. 20. API usages in Stack Overflow (left) and GitHub (right).

7.3 API Usages

Figure 20 shows the usages of Akka API by our actor bugs in Stack Overflow and GitHub. According to this figure, three packages, akka.actor, akka.pattern, and akka.testkit, are responsible for most usages in both Stack Overflow (98.6%) and GitHub (94.0%). However, the usages of individual API packages by Stack Overflow and GitHub actor bugs are different, and this difference can be significant for some packages. Packages akka.pattern and akka.actor are used 2 and 1.1 times more in Stack Overflow, and there is no usage of akka.routing in GitHub. In contrast, akka.serialization, akka.testkit, and akka.event are used 9, 4, and 2 times more in GitHub, and there is no usage of akka.cluster and akka.io in Stack Overflow.
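As a rough illustration of how the "times more" figures for packages can be derived, the following sketch divides the larger package share by the smaller one, using the bracketed percentages reported in Fig. 20. The class UsageRatioSketch and its method timesMore are hypothetical helpers introduced here for illustration; they are not part of the study's tooling, and the exact rounding used in the study is an assumption.

```java
import java.util.Locale;

// Hypothetical helper, not part of the study's tooling.
final class UsageRatioSketch {
    // Returns how many times larger the first share is than the second.
    static double timesMore(double largerSharePercent, double smallerSharePercent) {
        if (smallerSharePercent <= 0.0) {
            // A package with no usages in one source has no finite ratio.
            throw new IllegalArgumentException("ratio is undefined for a zero share");
        }
        return largerSharePercent / smallerSharePercent;
    }

    public static void main(String[] args) {
        // Shares are the bracketed percentages in Fig. 20.
        // More common in Stack Overflow (Stack Overflow share / GitHub share):
        System.out.printf(Locale.ROOT, "akka.pattern: %.2f%n", timesMore(10.1, 5.2));
        System.out.printf(Locale.ROOT, "akka.actor: %.2f%n", timesMore(87.0, 82.8));
        // More common in GitHub (GitHub share / Stack Overflow share):
        System.out.printf(Locale.ROOT, "akka.serialization: %.2f%n", timesMore(1.8, 0.2));
        System.out.printf(Locale.ROOT, "akka.testkit: %.2f%n", timesMore(6.0, 1.5));
        System.out.printf(Locale.ROOT, "akka.event: %.2f%n", timesMore(1.8, 0.9));
    }
}
```

Packages that appear in only one source, such as akka.routing, akka.cluster, and akka.io, have a zero share on the other side, so the sketch rejects them instead of reporting a ratio.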
In addition, although there is a significant difference between the usages of individual packages by Stack Overflow and GitHub actor bugs, there is no statistically significant difference between their distributions.

7.4 Implications

Using characteristics of our Stack Overflow and GitHub actor bugs for predictive bug finding. Previous work uses the characteristics of bugs as features to build and train models that can predict bugs. For example, Zhou et al. [107] use characteristics, such as root causes and files involved in the bug, for multithreaded bugs in Mozilla, KDE, and Apache to train a model that can predict the types, locations, and quantities of similar bugs in multithreaded software. New bug prediction tools and techniques can use actor bugs and their characteristics in terms of their root causes, symptoms, API usages, and their differences as features to build and train models that can predict similar Akka actor bugs. In addition, our observations about the differences between the commonalities and distributions of these characteristics can be used to decide which characteristics should be used as features for a specific dataset. For example, root causes of our Stack Overflow actor bugs may not be a good feature to train a model that predicts GitHub actor bugs. This is because of the significant differences between both commonalities and distributions of our Stack Overflow and GitHub root causes.

8 THREATS TO VALIDITY

Using tags, keywords, and maturity criteria to identify Akka questions and answers in Stack Overflow and mature projects in GitHub could be a threat. This is because these may not identify a complete set of Akka questions and answers and commits or may identify irrelevant ones. To minimize this threat, we closely follow the best practices of previous work in using tags [20, 26, 53, 54, 81, 104], keywords [36, 53, 54, 60, 88], and maturity criteria [33, 45, 84]. Additionally, we manually analyze each candidate question and answer and commit and discard irrelevant ones.

Using Stack Overflow and GitHub as the sources for actor bugs is another threat. This is because they may not include a representative set of actor bugs. However, the large numbers of questions and answers and developers in Stack Overflow and projects, commits, and developers in GitHub help to mitigate this threat. Furthermore, unlike some previous studies that use only Stack Overflow [20, 26, 30, 89, 104] or GitHub [51], we use both. The manual identification of actor bugs and fixes, their symptoms, root causes, API usages, and classifications could be another threat. To minimize this threat, we closely follow the best practices of previous work in using open card sort [20, 26, 53, 84]. In addition, our manual analyses use all the available information for a Stack Overflow candidate question bug, including its question, all its answers and comments, and for a GitHub candidate commit bug, including its message, original code, modified code, issue reports, and pull requests. Multiple authors with extensive expertise in actor and multithreaded concurrency performed the manual analysis, where they iterated and refined their results until in agreement.

The complexity of concurrency bugs, and its impact on their understanding, reporting, finding, and fixing, may be considered a threat. Simpler bugs are more likely to be understood and reported and easier to find and fix [34, 105]. Moreover, there could be bugs that are rarely or never reported, found, and fixed [107].
To mitigate this threat, we closely follow the best practices of previous work [31, 51, 79, 102] in studying only the bugs that developers have found and fixed. Like other empirical studies, our findings should be interpreted with our methodology in mind and understood to hold only for Akka actor bugs written in Scala or Java from Stack Overflow and GitHub.

9 RELATED WORK

Actor concurrency bugs. The most related to our work are the works by Hedden and Zhao [51] and Torres Lopez et al. [77]. Section 5 discusses these works and their overlaps with our work in detail. To summarize, Hedden and Zhao study 126 Akka actor bugs in twelve projects from ScalaIndex [91], classify their root causes into three categories and ten subcategories, and compare their actor bugs with cloud bugs. Our following root causes overlap with several of their root causes, with their root causes in parentheses: Logic (Logic), Race (Message Order, Operation Order, Cooperation), Explicit Life cycle (Creation, Shutdown, Recovery), Programming (Logic), and Messaging Patterns (Response). However, our five root causes API Confusion, Model Confusion, Misnaming, Misconfiguration, and Untyped Communication, which are root causes for 31.9% of our actor bugs, are new. Similarly, Torres Lopez et al. study twenty-three actor bugs in various actor models from eleven previous works and classify their root causes into two categories and six subcategories, and provide a list of static analyses, testing, debugging, and visualization tools that address these bugs. Our following root causes overlap with several of their root causes, with their root causes in parentheses: Race (Bad Message Interleaving, Memory Inconsistency).
However, our nine root causes, i.e., Logic, API Confusion, Explicit Life cycle, Messaging Patterns, Programming, Model Confusion, Misnaming, Misconfiguration, and Untyped Communication, accounting for 85.4% of our actor bugs, are new.

Verification of actor concurrency. For verification, Gordon [44] proposes a program logic with modal assertions for deductive reasoning and verification of safety properties of actor programs in a core actor calculus. Charalambides et al. [37] propose a typestate system for static reasoning of liveness properties in a restricted class of actor programs written in a simple actor language. Khatchadourian et al. [57] use typestate to efficiently and safely parallelize streams, another form of reactive programming. Bagherzadeh and Rajan [28] propose order types for static reasoning and detection of message race bugs. Haller and Loiko [47] propose a type system to control aliasing in Scala actor software. Bagherzadeh and Rajan [27] propose an interference-aware programming model to allow for modular reasoning and guarantee the absence of data races in Panini asynchronous message passing concurrency [87]. Khatchadourian et al. [56] also allow modular reasoning but for Aspect-Oriented Programming (AOP) using rely-guarantee clauses similar to those used for concurrent systems. Negara et al. [85] propose a static analysis to infer and guarantee the single ownership property for actors. Clebsch et al. [38] use reference capabilities to guarantee the absence of data races in the combination of actor and multithreaded concurrency. Colaço et al. [39] propose a type inference system to prevent orphan messages in a primitive actor calculus. D’Osualdo et al. [41] propose an infinite-state model checker for a core fragment of Erlang actor concurrency. Stivenart et al. [95] propose a mailbox abstraction to statically reason about the bounds on the sizes for mailboxes.

Testing and debugging of actor concurrency. Lauterburg et al. [61] propose Basset for systematic exploration of message schedules for given inputs. Tasharofi et al. [97] propose Bita to explore message schedules for given inputs using different message coverage criteria. Tasharofi et al. [99] also propose TransDPOR, a dynamic partial order reduction technique to reduce the state-space of message schedules. Li et al. [67] propose Tap, a technique to generate system-level test cases for a given code location in Akka software.
Sen and Agha propose dCute [93], a full coverage testing system to find deadlocks. For debugging, Torres Lopez et al. [78] propose Multiverse Debugging to allow for observation and debugging of all concurrent execution paths in AmbientTalk, an actor language for mobile ad hoc networks.

Multithreaded concurrency bugs. Lu et al. [79] study 105 local concurrency bugs in 4 server and client applications and classify their patterns, manifestation, fixing, and avoidance strategies. Leesatapornwongsa et al. [64] study 104 distributed concurrency bugs in 4 popular data center systems and classify their triggers, behaviors, and fixes. Gunawi et al. [46] study 3,655 distributed concurrency issues in 6 popular cloud systems and classify their vitality, aspects, hardware, hardware failure mode, software, implications, and scope. Wang et al. [103] study 57 concurrency bugs in 53 open-source Node.js software and classify their root causes, patterns, impacts, manifestation, and fix strategies.

Unlike works that focus on classification of root causes of actor bugs alone, verification, testing, and debugging of actor software, or analysis and classification of multithreaded concurrency bugs, our work focuses on classifying symptoms, root causes, API usages, and differences for 186 real-world Akka actor bugs from Stack Overflow and GitHub and discussing real-world examples of bugs with these root causes and symptoms and how developers discuss these root causes and symptoms. Khatchadourian et al. [58] also report on API misuse but for streams.

10 CONCLUSIONS AND FUTURE WORK

In this work, we construct and study a set of 186 real-world Akka actor bugs and their fixes from Stack Overflow and GitHub, understand and classify their symptoms, root causes, API usages, and differences, discuss real-world examples of actor bugs with these symptoms and root causes, investigate the relation of our findings with the findings of previous work, and discuss the implications of our findings. One avenue of future work is to analyze and classify fixes of our actor bugs. Another avenue is to analyze our actor bugs to identify the challenges in their detection and fixing.

REFERENCES

[1] Stack Overflow. Actor name is not unique invalidactornameexception. https://stackoverflow.com/questions/11693562/.
[2] Stack Overflow. Akka actors app hangs under high volume. https://stackoverflow.com/questions/11141311.
[3] Stack Overflow. Akka routing: Reply’s send to router ends up as dead letters. https://stackoverflow.com/questions/20564381/.
[4] Stack Overflow. akka.io dispatcher configuration exception. https://stackoverflow.com/questions/21686327.
[5] Stack Overflow. Cannot establish remote communication with akka. https://stackoverflow.com/questions/53188945.
[6] Stack Overflow. correctly terminate akka actors in scala. https://stackoverflow.com/questions/12324055/.
[7] Stack Overflow. Doesn’t immediately stop child akka actors when parent actor stopped. https://stackoverflow.com/questions/45574085.
[8] GitHub. fix garbling of big tcp.write. https://github.com/spray/spray/commit/332ba626b193794fd4d753839e664aacd4d302a5.
[9] Stack Overflow. How should an akka actor be created that might throw an exception? https://stackoverflow.com/questions/18648390.
[10] How to create a minimal, reproducible example. https://stackoverflow.com/help/minimal-reproducible-example.
[11] Stack Overflow. How to use futures with akka for asynchronous results. https://stackoverflow.com/questions/11693562/.
[12] Porter stemming algorithm in NLTK 3.4.5. https://www.nltk.org/api/nltk.stem.html.
[13] Stack Overflow. postrestart and prerestart methods are not getting invoked in akka actors. https://stackoverflow.com/questions/37608915.
[14] Stack Overflow. Scala akka actor - dead letters encountered. https://stackoverflow.com/questions/53200356.
[15] Stack Overflow. Send message to actor after restart from supervisor. https://stackoverflow.com/questions/48446194.
[16] GitHub. Spray project. https://github.com/spray/spray/commit/e34da115fa43d4d46db0e7ae06eea7fbcbc4fdfd.
[17] Akka actors in Spark. https://issues.apache.org/jira/browse/SPARK-5293, 2015.
[18] Gul Agha. Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge, MA, USA, 1986.
[19] Gul Agha and Carl Hewitt. Concurrent programming using actors: Exploiting large-scale parallelism. In Proceedings of the Fifth Conference on Foundations of Software Technology and Theoretical Computer Science, pages 19–41, Berlin, Heidelberg, 1985. Springer-Verlag.
[20] Syed Ahmed and Mehdi Bagherzadeh. What do concurrency developers ask about? a large-scale study using stack overflow. In Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’18, New York, NY, USA, 2018. Association for Computing Machinery. doi:10.1145/3239235.3239524.
[21] Akka 2.6.5 API. https://doc.akka.io/api/akka/current/akka/index.html, May 2020.
[22] Akka Actor Reference Config File. https://github.com/akka/akka/blob/master/akka-actor/src/main/resources/reference.conf, April 2020.
[23] Akka Typed. https://doc.akka.io/docs/akka/current/typed/index.html, May 2020.
[24] Joe Armstrong. Programming Erlang: Software for a Concurrent World. Pragmatic Bookshelf, 2007.
[25] Mehdi Bagherzadeh and Raffi Khatchadourian. Going big: A large-scale study on what big data developers ask. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2019, pages 432–442, New York, NY, USA, 2019. ACM. URL: http://doi.acm.org/10.1145/3338906.3338939, doi:10.1145/3338906.3338939.
[26] Mehdi Bagherzadeh and Raffi Khatchadourian. Going big: A large-scale study on what big data developers ask. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2019, pages 432–442, New York, NY, USA, 2019. Association for Computing Machinery. doi:10.1145/3338906.3338939.

[27] Mehdi Bagherzadeh and Hridesh Rajan. Panini: A concurrent programming model for solving pervasive and oblivious interference. In Proceedings of the 14th International Conference on Modularity, MODULARITY 2015, pages 93–108, New York, NY, USA, 2015. ACM. URL: http://doi.acm.org/10.1145/2724525.2724568, doi:10.1145/2724525.2724568.
[28] Mehdi Bagherzadeh and Hridesh Rajan. Order types: Static reasoning about message races in asynchronous message passing concurrency. In Proceedings of the 7th ACM SIGPLAN International Workshop on Programming Based on Actors, Agents, and Decentralized Control, AGERE 2017, pages 21–30, New York, NY, USA, 2017. ACM. URL: http://doi.acm.org/10.1145/3141834.3141837, doi:10.1145/3141834.3141837.
[29] Thomas Ball, Mayur Naik, and Sriram K. Rajamani. From symptom to cause: Localizing errors in counterexample traces. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’03, pages 97–105, New York, NY, USA, 2003. Association for Computing Machinery. doi:10.1145/604131.604140.
[30] Anton Barua, Stephen W. Thomas, and Ahmed E. Hassan. What are developers talking about? an analysis of topics and trends in Stack Overflow. Empirical Softw. Engg., 19(3):619–654, June 2014. URL: http://dx.doi.org/10.1007/s10664-012-9231-y, doi:10.1007/s10664-012-9231-y.
[31] Pamela Bhattacharya, Iulian Neamtiu, and Christian R. Shelton. Automated, highly-accurate, bug assignment using machine learning and tossing graphs. J. Syst. Softw., 85(10):2275–2292, October 2012. doi:10.1016/j.jss.2012.04.053.
[32] Francesco Adalberto Bianchi, Alessandro Margara, and Mauro Pezze. A survey of recent trends in testing concurrent software systems. IEEE Trans. Softw. Eng., 44(8):747–783, August 2018. doi:10.1109/TSE.2017.2707089.
[33] Sumon Biswas, Md Johirul Islam, Yijia Huang, and Hridesh Rajan. Boa meets Python: A Boa dataset of data science software in Python language. In Proceedings of the 16th International Conference on Mining Software Repositories, MSR ’19, pages 577–581. IEEE Press, 2019. doi:10.1109/MSR.2019.00086.
[34] Arkady Bron, Eitan Farchi, Yonit Magid, Yarden Nir, and Shmuel Ur. Applications of synchronization coverage. In Proceedings of the Tenth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’05, pages 206–212, New York, NY, USA, 2005. Association for Computing Machinery. doi:10.1145/1065944.1065972.
[35] Rafael Caballero, Enrique Martin-Martin, Adrián Riesco, and Salvador Tamarit. A core erlang semantics for declarative debugging. J. Log. Algebraic Methods Program., 107:1–37, 2019. doi:10.1016/j.jlamp.2019.05.002.
[36] Casey Casalnuovo, Prem Devanbu, Abilio Oliveira, Vladimir Filkov, and Baishakhi Ray. Assert use in GitHub projects. In Proceedings of the 37th International Conference on Software Engineering - Volume 1, ICSE ’15, pages 755–766. IEEE Press, 2015.
[37] Minas Charalambides, Karl Palmskog, and Gul Agha. Types for Progress in Actor Programs, pages 315–339. Springer International Publishing, Cham, 2019. doi:10.1007/978-3-030-21485-2_18.
[38] Sylvan Clebsch, Sophia Drossopoulou, Sebastian Blessing, and Andy McNeil. Deny capabilities for safe, fast actors. In Proceedings of the 5th International Workshop on Programming Based on Actors, Agents, and Decentralized Control, AGERE! 2015, pages 1–12, New York, NY, USA, 2015. ACM. URL: http://doi.acm.org/10.1145/2824815.2824816, doi:10.1145/2824815.2824816.
[39] J-L. Colaço, M. Pantel, and P. Sallé. A Set-Constraint-based analysis of Actors, pages 107–122. Springer US, Boston, MA, 1997. doi:10.1007/978-0-387-35261-9_8.
[40] Joeri De Koster, Tom Van Cutsem, and Wolfgang De Meuter. 43 years of actors: A taxonomy of actor models and their key properties. In Proceedings of the 6th International Workshop on Programming Based on Actors, Agents, and Decentralized Control, AGERE 2016, pages 31–40, New York, NY, USA, 2016. Association for Computing Machinery. doi:10.1145/3001886.3001890.
[41] Emanuele D’Osualdo, Jonathan Kochems, and C. H. Luke Ong. Automatic verification of Erlang-style concurrency. In Francesco Logozzo and Manuel Fähndrich, editors, Static Analysis, pages 454–476, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg.
[42] Sally Fincher and Josh Tenenberg. Making sense of card sorting data. Expert Systems, 22(3):89–93, 2005. URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1468-0394.2005.00299.x, arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1468-0394.2005.00299.x, doi:10.1111/j.1468-0394.2005.00299.x.
[43] GitHub Inc. The state of Octoverse. https://octoverse.github.com/.

[44] Colin S. Gordon. Modal assertions for actor correctness. In Proceedings of the 9th ACM SIGPLAN International Workshop on Programming Based on Actors, Agents, and Decentralized Control, AGERE 2019, pages 11–20, New York, NY, USA, 2019. ACM. URL: http://doi.acm.org/10.1145/3358499.3361221, doi:10.1145/3358499.3361221.
[45] Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, and Sunghun Kim. Deep API learning. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, pages 631–642, New York, NY, USA, 2016. Association for Computing Machinery. doi:10.1145/2950290.2950334.
[46] Haryadi S. Gunawi, Mingzhe Hao, Tanakorn Leesatapornwongsa, Tiratat Patana-anake, Thanh Do, Jeffry Adityatama, Kurnia J. Eliazar, Agung Laksono, Jeffrey F. Lukman, Vincentius Martin, and Anang D. Satria. What bugs live in the cloud? a study of 3000+ issues in cloud systems. In Proceedings of the ACM Symposium on Cloud Computing, SOCC ’14, pages 1–14, New York, NY, USA, 2014. Association for Computing Machinery. doi:10.1145/2670979.2670986.
[47] Philipp Haller and Alex Loiko. Lacasa: Lightweight affinity and object capabilities in Scala. SIGPLAN Not., 51(10):272–291, October 2016. URL: http://doi.acm.org/10.1145/3022671.2984042, doi:10.1145/3022671.2984042.
[48] Robert H. Halstead. Multilisp: A language for concurrent symbolic computation. ACM Trans. Program. Lang. Syst., 7(4):501–538, October 1985. doi:10.1145/4472.4478.
[49] Quinn Hanam, Fernando S. de M. Brito, and Ali Mesbah. Discovering bug patterns in javascript. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, pages 144–156, New York, NY, USA, 2016. Association for Computing Machinery. doi:10.1145/2950290.2950308.
[50] Yaroslav Hayduk, Anita Sobe, and Pascal Felber. Dynamic message processing and transactional memory in the actor model. In Alysson Bessani and Sara Bouchenak, editors, Distributed Applications and Interoperable Systems, pages 94–107, Cham, 2015. Springer International Publishing.
[51] Brandon Hedden and Xinghui Zhao. A comprehensive study on bugs in actor systems. In Proceedings of the 47th International Conference on Parallel Processing, ICPP 2018, pages 56:1–56:9, New York, NY, USA, 2018. ACM. URL: http://doi.acm.org/10.1145/3225058.3225139, doi:10.1145/3225058.3225139.
[52] David Hovemeyer and William Pugh. Finding bugs is easy. SIGPLAN Not., 39(12):92–106, December 2004. doi:10.1145/1052883.1052895.
[53] Md Johirul Islam, Giang Nguyen, Rangeet Pan, and Hridesh Rajan. A comprehensive study on deep learning bug characteristics. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2019, pages 510–520, New York, NY, USA, 2019. ACM. URL: http://doi.acm.org/10.1145/3338906.3338955, doi:10.1145/3338906.3338955.
[54] Md Johirul Islam, Rangeet Pan, Giang Nguyen, and Hridesh Rajan. Repairing deep neural networks fix patterns and challenges. In Proceedings of the 42nd International Conference on Software Engineering - Volume 2, ICSE ’20. IEEE Press, 2020.
[55] Khari Johnson. GitHub passes 100 million repositories. https://venturebeat.com/2018/11/08/github-passes-100-million-repositories/.
[56] Raffi Khatchadourian, Johan Dovland, and Neelam Soundarajan. Enforcing behavioral constraints in evolving aspect-oriented programs. In Proceedings of the 7th Workshop on Foundations of Aspect-Oriented Languages, FOAL ’08, pages 19–28, New York, NY, USA, 2008. Association for Computing Machinery. doi:10.1145/1394496.1394499.
[57] Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed. Safe automated refactoring for intelligent parallelization of java 8 streams. In Proceedings of the 41st International Conference on Software Engineering, ICSE ’19, pages 619–630, Piscataway, NJ, USA, 2019. IEEE Press. doi:10.1109/ICSE.2019.00072.
[58] Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, and Baishakhi Ray. An empirical study on the use and misuse of Java 8 streams. In International Conference on Fundamental Approaches to Software Engineering, FASE 2020, pages 97–118. ETAPS, Springer, April 2020. doi:10.1007/978-3-030-45234-6_5.
[59] Atul S. Khot. Concurrent Patterns and Best Practices: Build Scalable Apps with Patterns in Multithreading, Synchronization, and Functional Programming. Packt Publishing, 2018.
[60] Sunghun Kim and E. James Whitehead. How long did it take to fix bugs? In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR ’06, pages 173–174, New York, NY, USA, 2006. Association for Computing Machinery. doi:10.1145/1137983.1138027.

[61] Steven Lauterburg, Mirco Dotta, Darko Marinov, and Gul Agha. A framework for state-space exploration of Java-based actor programs. In Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering, ASE ’09, pages 468–479, Washington, DC, USA, 2009. IEEE Computer Society. doi:10.1109/ASE.2009.88.
[62] Steven Lauterburg, Rajesh K. Karmani, Darko Marinov, and Gul Agha. Evaluating ordering heuristics for dynamic partial-order reduction techniques. In Proceedings of the 13th International Conference on Fundamental Approaches to Software Engineering, FASE’10, pages 308–322, Berlin, Heidelberg, 2010. Springer-Verlag. URL: http://dx.doi.org/10.1007/978-3-642-12029-9_22, doi:10.1007/978-3-642-12029-9_22.
[63] Edward A. Lee. The problem with threads. Computer, 39(5):33–42, May 2006. doi:10.1109/MC.2006.180.
[64] Tanakorn Leesatapornwongsa, Jeffrey F. Lukman, Shan Lu, and Haryadi S. Gunawi. Taxdc: A taxonomy of non-deterministic concurrency bugs in datacenter distributed systems. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’16, pages 517–530, New York, NY, USA, 2016. ACM. URL: http://doi.acm.org/10.1145/2872362.2872374, doi:10.1145/2872362.2872374.
[65] Mark C. Lewis and Lisa L. Lacher. Object-Orientation, Abstraction, and Data Structures Using Scala, Second Edition. Chapman & Hall/CRC, 2nd edition, 2016.
[66] He Li, Jie Luo, and Wei Li. A formal semantics for debugging synchronous message passing-based concurrent programs. Science China Information Sciences, 57(12):1–18, 2014.
[67] Sihan Li, Farah Hariri, and Gul Agha. Targeted Test Generation for Actor Systems. In Todd Millstein, editor, 32nd European Conference on Object-Oriented Programming (ECOOP 2018), volume 109 of Leibniz International Proceedings in Informatics (LIPIcs), pages 8:1–8:31, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. URL: http://drops.dagstuhl.de/opus/volltexte/2018/9213, doi:10.4230/LIPIcs.ECOOP.2018.8.
[68] Zhenmin Li and Yuanyuan Zhou. Pr-miner: Automatically extracting implicit programming rules and detecting violations in large software code. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ESEC/FSE-13, pages 306–315, New York, NY, USA, 2005. ACM. URL: http://doi.acm.org/10.1145/1081706.1081755, doi:10.1145/1081706.1081755.
[69] B. Liang, P. Bian, Y. Zhang, W. Shi, W. You, and Y. Cai. Antminer: Mining more bugs by reducing noise interference. In 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), pages 333–344, May 2016. doi:10.1145/2884781.2884870.
[70] Lightbend. Akka documentation: Supervision and monitoring. https://doc.akka.io/docs/akka/2.5/general/supervision.html.
[71] Lightbend. Akka.actor documentation. https://doc.akka.io/api/akka/current/akka/actor/Actor.html.
[72] Lightbend. Akka. https://www.lightbend.com/akka-part-of-lightbend-platform, 2019.
[73] Lightbend. Customer case studies. https://www.lightbend.com/case-studies#filter:akka, 2019.
[74] Lightbend. How Groupon scales personalized offers to 48 million customers on time. https://www.lightbend.com/case-studies/groupon-scalability-personalized-offers-to-48-million-customers, 2020.
[75] Lightbend. PayPal blows past 1 billion transactions per day using just 8 VMs with Akka, Scala, Kafka and Akka Streams. https://www.lightbend.com/case-studies/paypal-blows-past-1-billion-transactions-per-day-using-just-8-vms-and-akka-scala-kafka-and-akka-streams, 2020.
[76] Yuheng Long, Mehdi Bagherzadeh, Eric Lin, Ganesha Upadhyaya, and Hridesh Rajan. On ordering problems in message passing software. In Proceedings of the 15th International Conference on Modularity, MODULARITY 2016, pages 54–65, New York, NY, USA, 2016. ACM. URL: http://doi.acm.org/10.1145/2889443.2889444, doi:10.1145/2889443.2889444.
[77] Carmen Torres Lopez, Stefan Marr, Elisa Gonzalez Boix, and Hanspeter Mössenböck. A study of concurrency bugs and advanced development support for actor-based programs. In Programming with Actors - State-of-the-Art and Research Perspectives, pages 155–185. 2018. doi:10.1007/978-3-030-00302-9_6.
[78] Carmen Torres Lopez, Robbert Gurdeep Singh, Stefan Marr, Elisa Gonzalez Boix, and Christophe Scholliers. Multiverse debugging: Non-deterministic debugging for non-deterministic programs (brave new idea paper). In Alastair F. Donaldson, editor, 33rd European Conference on Object-Oriented Programming, ECOOP 2019, July 15-19, 2019, London, United Kingdom, volume 134 of LIPIcs, pages 27:1–27:30. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.ECOOP.2019.27.
[79] Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIII, pages 329–339, New York, NY, USA, 2008. ACM. URL: http://doi.acm.org/10.1145/1346281.1346323, doi:10.1145/1346281.1346323.
[80] Manuel Bernhardt. Akka anti-patterns: race condition. https://manuel.bernhardt.io/2016/08/16/akka-anti-patterns-race-conditions/.
[81] Na Meng, Stefan Nagy, Danfeng (Daphne) Yao, Wenjie Zhuang, and Gustavo Arango Argoty. Secure coding practices in Java: Challenges and vulnerabilities. In Proceedings of the 40th International Conference on Software Engineering, ICSE ’18, pages 372–383, New York, NY, USA, 2018. Association for Computing Machinery. doi:10.1145/3180155.3180201.
[82] Microsoft. https://dotnet.github.io/orleans/index.html.
[83] Martin Monperrus. A critical review of "automatic patch generation learned from human-written patches": Essay on the problem statement and the evaluation of automatic software repair. In Proceedings of the 36th International Conference on Software Engineering, ICSE 2014, pages 234–242, New York, NY, USA, 2014. Association for Computing Machinery. doi:10.1145/2568225.2568324.
[84] Sarah Nadi, Stefan Krüger, Mira Mezini, and Eric Bodden. Jumping through hoops: Why do Java developers struggle with cryptography APIs? In Proceedings of the 38th International Conference on Software Engineering, ICSE ’16, pages 935–946, New York, NY, USA, 2016. Association for Computing Machinery. doi:10.1145/2884781.2884790.
[85] Stas Negara, Rajesh K. Karmani, and Gul Agha. Inferring ownership transfer for efficient message passing. SIGPLAN Not., 46(8):81–90, February 2011. URL: http://doi.acm.org/10.1145/2038037.
1539 1941566, doi:10.1145/2038037.1941566. 1540 [86] Netty. https://netty.io/, May 2020. 1541 [87] Hridesh Rajan. Capsule-oriented programming. In Proceedings of the 37th International Conference 1542 on Software Engineering - Volume 2, ICSE ’15, pages 611–614. IEEE Press, 2015. [88] Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane, Zhaopeng Tu, Alberto Bacchelli, and Premkumar 1543 Devanbu. On the "naturalness" of buggy code. In Proceedings of the 38th International Conference on 1544 Software Engineering, ICSE ’16, pages 428–439, New York, NY, USA, 2016. Association for Computing 1545 Machinery. doi:10.1145/2884781.2884848. 1546 [89] Christoffer Rosen and Emad Shihab. What are mobile developers asking about? a large scale studyusing 1547 stack overflow. Empirical Softw. Engg., 21(3):1192–1223, June 2016. doi:10.1007/s10664-015-9379-3. [90] Scala 2.13.2 API. https://www.scala-lang.org/api/current/scala/index.html, May 2020. 1548 [91] ScalaIndex. The scala library index. https://index.scala-lang.org/. 1549 [92] Christophe Scholliers, Eric Tanter, and Wolfgang De Meuter. Parallel actor monitors: Disentangling 1550 task-level parallelism from data partitioning in the actor model. Science of Computer Programming, 1551 80:52 – 64, 2014. Special section on foundations of coordination languages and software architectures 1552 (selected papers from FOCLASA’10), Special section - Brazilian Symposium on Programming Languages (SBLP 2010) and Special section on formal methods for industrial critical systems (Selected papers 1553 from FMICS’11). URL: http://www.sciencedirect.com/science/article/pii/S0167642313000725, doi: 1554 https://doi.org/10.1016/j.scico.2013.03.011. 1555 [93] Koushik Sen and Gul Agha. Automated systematic testing of open distributed programs. In Proceedings 1556 of the 9th International Conference on Fundamental Approaches to Software Engineering, FASE’06, 1557 pages 339–356, Berlin, Heidelberg, 2006. Springer-Verlag. doi:10.1007/11693017_25. 
[94] Stack Exchange Dump. https://archive.org/details/stackexchange, December 2019. 1558 [95] Quentin Stiévenart, Jens Nicolay, Wolfgang De Meuter, and Coen De Roover. Mailbox Abstrac- 1559 tions for Static Analysis of Actor Programs. In Peter Müller, editor, 31st European Confer- 1560 ence on Object-Oriented Programming (ECOOP 2017), volume 74 of Leibniz International Pro- 1561 ceedings in Informatics (LIPIcs), pages 25:1–25:30, Dagstuhl, Germany, 2017. Schloss Dagstuhl– 1562 Leibniz-Zentrum fuer Informatik. URL: http://drops.dagstuhl.de/opus/volltexte/2017/7254, doi: 10.4230/LIPIcs.ECOOP.2017.25. 1563 [96] Janwillem Swalens, Stefan Marr, Joeri De Koster, and Tom Van Cutsem. Towards composable 1564 concurrency abstractions. In Proceedings of the Workshop on Programming Language Approaches to 1565 Concurrency and communication-cEntric Software (PLACES), volume 155 of PLACES ’14, pages 1566 54–60, April 2014. URL: http://arxiv.org/abs/1406.3485, doi:10.4204/EPTCS.155.8. 1567 1568 Proc. ACM Program. Lang., Vol. 1, No. OOPSLA, Article 1. Publication date: November 2020. Actor Concurrency Bugs in Akka: Symptoms, Root Causes, API Usages, and Differences 1:33

1569 [97] S. Tasharofi, M. Pradel, Y. Lin, and R. Johnson. Bita: Coverage-guided, automatic testing ofactor 1570 programs. In 2013 28th IEEE/ACM International Conference on Automated Software Engineering 1571 (ASE), pages 114–124, Nov 2013. doi:10.1109/ASE.2013.6693072. [98] Samira Tasharofi, Peter Dinges, and Ralph E. Johnson. Why do scala developers mix theactor 1572 model with other concurrency models? In Proceedings of the 27th European Conference on Object- 1573 Oriented Programming, ECOOP’13, pages 302–326, Berlin, Heidelberg, 2013. Springer-Verlag. URL: 1574 http://dx.doi.org/10.1007/978-3-642-39038-8_13, doi:10.1007/978-3-642-39038-8_13. 1575 [99] Samira Tasharofi, Rajesh K. Karmani, Steven Lauterburg, Axel Legay, Darko Marinov, andGul 1576 Agha. TransDPOR: A novel dynamic partial-order reduction technique for testing actor pro- grams. In Proceedings of the 14th Joint IFIP WG 6.1 International Conference and Proceedings 1577 of the 32Nd IFIP WG 6.1 International Conference on Formal Techniques for Distributed Sys- 1578 tems, FMOODS’12/FORTE’12, pages 219–234, Berlin, Heidelberg, 2012. Springer-Verlag. URL: 1579 http://dx.doi.org/10.1007/978-3-642-30793-5_14, doi:10.1007/978-3-642-30793-5_14. 1580 [100] The GHTorrent project. https://ghtorrent.org/, April 2020. 1581 [101] Carmen Torres Lopez, Elisa Gonzalez Boix, Christophe Scholliers, Stefan Marr, and Hanspeter Mössenböck. A principled approach towards debugging communicating event-loops. In Proceedings 1582 of the 7th ACM SIGPLAN International Workshop on Programming Based on Actors, Agents, 1583 and Decentralized Control, AGERE 2017, pages 41–49, New York, NY, USA, 2017. Association for 1584 Computing Machinery. doi:10.1145/3141834.3141839. 1585 [102] Tengfei Tu, Xiaoyu Liu, Linhai Song, and Yiying Zhang. Understanding real-world concurrency bugs 1586 in go. 
In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’19, pages 865–878, New York, NY, USA, 1587 2019. ACM. URL: http://doi.acm.org/10.1145/3297858.3304069, doi:10.1145/3297858.3304069. 1588 [103] Jie Wang, Wensheng Dou, Yu Gao, Chushu Gao, Feng Qin, Kang Yin, and Jun Wei. A comprehensive 1589 study on real world concurrency bugs in node.js. In Proceedings of the 32Nd IEEE/ACM International 1590 Conference on Automated Software Engineering, ASE 2017, pages 520–531, Piscataway, NJ, USA, 1591 2017. IEEE Press. URL: http://dl.acm.org/citation.cfm?id=3155562.3155628. [104] Xin-Li Yang, David Lo, Xin Xia, Zhi-Yuan Wan, and Jian-Ling Sun. What security questions do 1592 developers ask? a large-scale study of Stack Overflow posts. Journal of Computer Science and 1593 Technology, 31(5):910–924, Sep 2016. doi:10.1007/s11390-016-1672-0. 1594 [105] Jie Yu. Finding and Tolerating Concurrency Bugs. PhD thesis, The University of Michigan, 2013. 1595 [106] Yuhao Zhang, Yifan Chen, Shing-Chi Cheung, Yingfei Xiong, and Lu Zhang. An empirical study on 1596 TensorFlow program bugs. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2018, pages 129–140, New York, NY, USA, 2018. ACM. URL: 1597 http://doi.acm.org/10.1145/3213846.3213866, doi:10.1145/3213846.3213866. 1598 [107] Bo Zhou, Iulian Neamtiu, and Rajiv Gupta. Predicting concurrency bugs: How many, what kind and 1599 where are they? In Proceedings of the 19th International Conference on Evaluation and Assessment 1600 in Software Engineering, EASE ’15, New York, NY, USA, 2015. Association for Computing Machinery. 1601 doi:10.1145/2745802.2745807. [108] Thomas Zimmermann, Nachiappan Nagappan, Philip J. Guo, and Brendan Murphy. Characterizing 1602 and predicting which bugs get reopened. In Proceedings of the 34th International Conference on 1603 Software Engineering, ICSE ’12, pages 1074–1083. 
IEEE Press, 2012. 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 Proc. ACM Program. Lang., Vol. 1, No. OOPSLA, Article 1. Publication date: November 2020.