Scouts: Improving the Diagnosis Process Through Domain-Customized Incident Routing

Total Page:16

File Type:pdf, Size:1020Kb

Scouts: Improving the Diagnosis Process Through Domain-Customized Incident Routing Scouts: Improving the Diagnosis Process Through Domain-customized Incident Routing Jiaqi Gao¢, Nofel Yaseen⋄, Robert MacDavid∗, Felipe Vieira Frujeri◦, Vincent Liu⋄, Ricardo Bianchini◦ Ramaswamy Adityax, Xiaohang Wangx, Henry Leex, David Maltzx, Minlan Yu¢, Behnaz Arzani◦ ¢Harvard University ⋄University of Pennsylvania ∗Princeton University ◦Microsoft Research xMicrosoft ABSTRACT service-level objectives. When incidents are mis-routed (sent to the Incident routing is critical for maintaining service level objectives wrong team), their time-to-diagnosis can increase by 10× [21]. in the cloud: the time-to-diagnosis can increase by 10× due to mis- A handful of well-known teams that underpin other services routings. tend to bear the brunt of this effect. The physical networking team Properly routing incidents is challenging because of the complex- in our large cloud, for instance, is a recipient in 1 in every 10 mis- ity of today’s data center (DC) applications and their dependencies. routed incidents (see §3). In comparison, the hundreds of other For instance, an application running on a VM might rely on a possible teams typically receive 1 in 100 to 1 in 1000. These findings functioning host-server, remote-storage service, and virtual and are common across the industry (see Appendix A). physical network components. It is hard for any one team, rule- Incident routing remains challenging because modern DC ap- based system, or even machine learning solution to fully learn the plications are large, complex, and distributed systems that rely complexity and solve the incident routing problem. We propose a on many sub-systems and components. Applications’ connections different approach using per-team Scouts. Each teams’ Scout actsas to users, for example, might cross the DC network and multiple its gate-keeper — it routes relevant incidents to the team and routes- ISPs, traversing firewalls and load balancers along the way. Anyof away unrelated ones. We solve the problem through a collection these components may be responsible for connectivity issues. The of these Scouts. Our PhyNet Scout alone — currently deployed in internal architectures and the relationships between these compo- production — reduces the time-to-mitigation of 65% of mis-routed nents may change over time. In the end, we find that the traditional incidents in our dataset. method of relying on humans and human-created rules to route incidents is inefficient, time-consuming, and error-prone. CCS CONCEPTS Instead, we seek a tool that can automatically analyze these complex relationships and route incidents to the team that is most • Computing methodologies → Machine learning; • Networks likely responsible; we note that machine learning (ML) is a potential → Data center networks; match for this classification task. In principle, a single, well-trained KEYWORDS ML model could process the massive amount of data available from operators’ monitoring systems—too vast and diverse for humans— Data center networks; Machine learning; Diagnosis to arrive at an informed prediction. Similar techniques have found ACM Reference Format: success in more limited contexts (e.g., specific problems and/or Jiaqi Gao, Nofel Yaseen, Robert MacDavid, Felipe Vieira Frujeri, Vincent applications) [11, 15, 22, 25, 73]. Unfortunately, we quickly found Liu, RicardoBianchini, Ramaswamy Aditya, Xiaohang Wang, Henry Lee, operationalizing this monolithic ML model comes with fundamental David Maltz, Minlan Yu,Behnaz Arzani. 2020. Scouts: Improving the Diag- technical and practical challenges: nosis Process Through Domain-customized Incident Routing. In Annual A constantly changing set of incidents, components, and monitoring conference of the ACM Special Interest Group on Data Communication on the data: As the root causes of incidents are addressed and components applications, technologies, architectures, and protocols for computer communi- cation (SIGCOMM ’20), August 10–14, 2020, Virtual Event, NY, USA. ACM, evolve over time, both the inputs and the outputs of the model are New York, NY, USA, 17 pages. https://doi.org/10.1145/3387514.3405867 constantly in flux. When incidents change, we are often left without enough training data and when components change, we potentially 1 INTRODUCTION need to retrain across the entire fleet. For cloud providers, incident routing — taking an issue that is too Curse of dimensionality: A monolithic incident router needs to in- complex for automated techniques and assigning it to a team of clude monitoring data from all teams. This large resulting feature engineers — is a critical bottleneck to maintaining availability and vector leads to “the curse of dimensionality” [4]. The typical solu- tion of increasing the number of training examples in proportion to Permission to make digital or hard copies of all or part of this work for personal or the number of features is not possible in a domain where examples classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation (incidents) are already relatively rare events. on the first page. Copyrights for components of this work owned by others than ACM Uneven instrumentation: A subset of teams will always have gaps must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a in monitoring, either because the team has introduced new compo- fee. Request permissions from [email protected]. nents and analytics have not caught up, or because measuring is just SIGCOMM ’20 hard, e.g., in active measurements where accuracy and overhead © 2020 Association for Computing Machinery. ACM ISBN 978-1-4503-7955-7/20/08...$15.00 are in direct contention [34]. https://doi.org/10.1145/3387514.3405867 SIGCOMM ’20, August 10–14, 2020, Virtual Event, NY, USA Gao et al. Limited visibility into every team: For the same reasons that it is 3) The design of a Scout for Microsoft Azure’s physical networking difficult for teams to have expertise in all surrounding components, team accompanied by a framework to enable its evolution as the it is difficult for us to understand the appropriate feature setsfrom team’s monitoring systems, incidents, and responsibilities change. each and every team. 4) A thorough evaluation of the deployed PhyNet Scout and analysis Rather than building a single, monolithic predictor, we argue a of incidents in our cloud from the past year and a discussion of the piecewise solution based on a collection of (strategically assigned) challenges the Scout encountered in practice. per-team predictors, a.k.a. Scouts, is more useful. Scouts are low- This paper is the first to propose a decomposed solution tothe overhead, low-latency, and high-accuracy tools that predict, for a incident routing problem. We take the first step in demonstrating given team, whether the team should be involved. They are built by such a solution can be effective by building a Scout for the PhyNet the team to which they apply and are amenable to partial deploy- team of Microsoft Azure. This team was one of the teams most ment. Scouts address the above challenges: they only need to adapt heavily impacted by the incident routing problem. As such, it was a to changes to their team and its components (instead of all changes), good first candidate to demonstrate the benefits Scouts can provide; they operate over a more limited feature set (no longer suffer the we leave the detailed design of other teams’ Scouts for future work. curse of dimensionality), they limit the need for understanding the internals of every team (they only need to encode information about the team they are designed for and its local dependencies), 2 BACKGROUND: INCIDENT ROUTING and only require local instrumentation. Scouts can utilize a hybrid Incidents constitute unintended behavior that can potentially im- of supervised and unsupervised models to account for changes to pact service availability and performance. Incidents are reported incidents (see §5) and can provide explanations as to why they de- by customers, automated watchdogs, or discovered and reported cided the team is (not) responsible. Operators can be strategic about manually by operators. which Scouts they need: they can build Scouts for teams (such as Incident routing is the process through which operators decide our physical networking team) that are inordinately affected by which team should investigate an incident. In this context, we mis-routings. Given a set of Scouts, operators can incrementally use team to broadly refer to both internal teams in the cloud and compose them, either through a global routing system or through external organizations such as ISPs. Today, operators use run-books, the existing manual process. past-experience, and a natural language processing (NLP)-based We designed, implemented, and deployed a Scout for the physical recommendation system (see §7), to route incidents. Specifically, networking team of a large cloud.1 We focus on this team as, from incidents are created and routed using a few methods: our study of our cloud and other operators, we find the network — 1) By automated watchdogs that run inside the DC and monitor and specifically the physical network — suffers inordinately from the health of its different components. When a watchdog uncovers mis-routing (see §3). This team exhibits all of the challenges of a problem it follows a built-in set of rules to determine where it Scout construction: diverse, dirty datasets; complex dependencies should send the incident. inside and outside the provider; many reliant services; and frequent 2) As Customer Reported Incidents (CRIs) which go directly to a changes. As the team evolves, the framework we developed adapts 24 × 7 support team that uses past experience and a number of automatically and without expert intervention through the use of specialized tools to determine where to send the incident. If the meta-learning techniques [46]. cause is an external problem, the team contacts the organization These same techniques can be used to develop new “starter” responsible.
Recommended publications
  • Hermes: a Scalable Event-Based Middleware
    UCAM-CL-TR-590 Technical Report ISSN 1476-2986 Number 590 Computer Laboratory Hermes: A scalable event-based middleware Peter R. Pietzuch June 2004 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 http://www.cl.cam.ac.uk/ c 2004 Peter R. Pietzuch This technical report is based on a dissertation submitted February 2004 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Queens’ College. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/TechReports/ ISSN 1476-2986 Abstract Large-scale distributed systems require new middleware paradigms that do not suffer from the limitations of traditional request/reply middleware. These limitations include tight coupling between components, a lack of information filtering capabilities, and support for one-to-one communication semantics only. We argue that event-based middleware is a scalable and power- ful new type of middleware for building large-scale distributed systems. However, it is important that an event-based middleware platform includes all the standard functionality that an appli- cation programmer expects from middleware. In this thesis we describe the design and implementation of Hermes, a distributed, event- based middleware platform. The power and flexibility of Hermes is illustrated throughout for two application domains: Internet-wide news distribution and a sensor-rich, active building. Hermes follows a type- and attribute-based publish/subscribe model that places particular emphasis on programming language integration by supporting type-checking of event data and event type inheritance. To handle dynamic, large-scale environments, Hermes uses peer-to-peer techniques for autonomic management of its overlay network of event brokers and for scalable event dissemination.
    [Show full text]
  • Bridging the Data Charging Gap in the Cellular Edge
    Bridging the Data Charging Gap in the Cellular Edge Yuanjie Li, Kyu-Han Kim, Christina Vlachou, Junqing Xie Hewlett Packard Labs {yuanjiel,kyu-han.kim,christina.vlachou,jun-qing.xie}@hpe.com ABSTRACT licensed spectrum and support carrier-grade QoS for low latency. The 4G/5G cellular edge promises low-latency experiences any- We have seen some early operational edge services via 4G/5G (x2.2). where, anytime. However, data charging gaps can arise between The cellular network data is not free. To enjoy the low-latency the cellular operators and edge application vendors, and cause services, the edge app vendors should pay the cellular operators by over-/under-billing. We find that such gap can come from data loss, traffic usage (similar to our mobile data plans). Even for the“un- selfish charging, or both. It can be amplified in the edge, duetoits limited” data plans, the edge app’s network speed will be throttled low-latency requirements. We devise TLC, a Trusted, Loss-tolerant (e.g., 128Kbps) if its usage exceeds pre-defined quota. This model Charging scheme for the cellular edge. In its core, TLC enables loss- inherits from 4G/5G [6, 7], and is standardized in the edge [8]. selfishness cancellation to bridge the gap, and constructs publicly However, the cellular operator and edge app vendor may not verifiable, cryptographic proof-of-charging for mutual trust. Weim- always agree on how much data should be paid. A data charging plement TLC with commodity edge nodes, OpenEPC and small cells. gap can arise, i.e., the data volume charged by the cellular opera- Our experiments in various edge scenarios validate TLC’s viability tor differs from what the edge sends/receives.
    [Show full text]
  • [Front Matter]
    Future Technologies Conference (FTC) 2017 29-30 November 2017| Vancouver, Canada About the Conference IEEE Technically Sponsored Future Technologies Conference (FTC) 2017 is a second research conference in the series. This conference is a part of SAI conferences being held since 2013. The conference series has featured keynote talks, special sessions, poster presentation, tutorials, workshops, and contributed papers each year. The goal of the conference is to be a world's pre-eminent forum for reporting technological breakthroughs in the areas of Computing, Electronics, AI, Robotics, Security and Communications. FTC 2017 is held at Pan Pacific Hotel Vancouver. The Pan Pacific luxury Vancouver hotel in British Columbia, Canada is situated on the downtown waterfront of this vibrant metropolis, with some of the city’s top business venues and tourist attractions including Flyover Canada and Gastown, the Vancouver Convention Centre, Cruise Ship Terminal as well as popular shopping and entertainment districts just minutes away. Pan Pacific Hotel Vancouver has great meeting rooms with new designs. We chose it for the conference because it’s got plenty of space, serves great food and is 100% accessible. Venue Name: Pan Pacific Hotel Vancouver Address: Suite 300-999 Canada Place, Vancouver, British Columbia V6C 3B5, Canada Tel: +1 604-662-8111 3 | P a g e Future Technologies Conference (FTC) 2017 29-30 November 2017| Vancouver, Canada Preface FTC 2017 is a recognized event and provides a valuable platform for individuals to present their research findings, display their work in progress and discuss conceptual advances in areas of computer science and engineering, Electrical Engineering and IT related disciplines.
    [Show full text]
  • SIGOPS Annual Report 2012
    SIGOPS Annual Report 2012 Fiscal Year July 2012-June 2013 Submitted by Jeanna Matthews, SIGOPS Chair Overview SIGOPS is a vibrant community of people with interests in “operatinG systems” in the broadest sense, includinG topics such as distributed computing, storaGe systems, security, concurrency, middleware, mobility, virtualization, networkinG, cloud computinG, datacenter software, and Internet services. We sponsor a number of top conferences, provide travel Grants to students, present yearly awards, disseminate information to members electronically, and collaborate with other SIGs on important programs for computing professionals. Officers It was the second year for officers: Jeanna Matthews (Clarkson University) as Chair, GeorGe Candea (EPFL) as Vice Chair, Dilma da Silva (Qualcomm) as Treasurer and Muli Ben-Yehuda (Technion) as Information Director. As has been typical, elected officers agreed to continue for a second and final two- year term beginning July 2013. Shan Lu (University of Wisconsin) will replace Muli Ben-Yehuda as Information Director as of AuGust 2013. Awards We have an excitinG new award to announce – the SIGOPS Dennis M. Ritchie Doctoral Dissertation Award. SIGOPS has lonG been lackinG a doctoral dissertation award, such as those offered by SIGCOMM, Eurosys, SIGPLAN, and SIGMOD. This new award fills this Gap and also honors the contributions to computer science that Dennis Ritchie made durinG his life. With this award, ACM SIGOPS will encouraGe the creativity that Ritchie embodied and provide a reminder of Ritchie's leGacy and what a difference a person can make in the field of software systems research. The award is funded by AT&T Research and Alcatel-Lucent Bell Labs, companies that both have a strong connection to AT&T Bell Laboratories where Dennis Ritchie did his seminal work.
    [Show full text]
  • Llista Congressos Notables
    Llistat de congressos notables per la UPC (Ordenat pel nom del congrés) Servei d'Informació RDI (Per dubtes: [email protected]). 5 de febrer de 2018 Nom del congrés Acrònim AAAI Conference on Artificial Intelligence AAAI ACM Conference on Computer and Communications Security CCS ACM Conference on Human Factors in Computing Systems CHI ACM European Conference on Computer Systems EuroSys ACM Great Lakes Symposium on VLSI GLSVLSI ACM International Conference on Information and Knowledge Management CIKM ACM International Conference on Measurement and Modelling of Computer Systems SIGMETRICS ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems MSWiN ACM International Conference on Multimedia ACMMM ACM International Symposium on Mobile AdHoc Networking and Computing ACM MOBIHOC ACM International Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems MSWiM ACM International Symposium on Physical Design ISPD ACM Object-Oriented Programming, Systems, Languages & Applications OOPSLA ACM Principles of Programming Languages POPL ACM SIGCOMM Special Interest Group on Data Communications SIGCOMM ACM SIGMOD Conference ACM SIGMOD ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation PLDI ACM Symposium on Applied Computing SAC ACM Symposium on Operating Systems Principles SOSP ACM Symposium on Parallel Algorithms and Architectures SPAA ACM Symposium on Principles of Database Systems PODS ACM Symposium on Principles of Distributed Computing PODC ACM Symposium on
    [Show full text]
  • Annual Report
    ANNUAL REPORT 2019FISCAL YEAR ACM, the Association for Computing Machinery, is an international scientific and educational organization dedicated to advancing the arts, sciences, and applications of information technology. Letter from the President It’s been quite an eventful year and challenges posed by evolving technology. for ACM. While this annual Education has always been at the foundation of exercise allows us a moment ACM, as reflected in two recent curriculum efforts. First, “ACM’s mission to celebrate some of the many the ACM Task Force on Data Science issued “Comput- hinges on successes and achievements ing Competencies for Undergraduate Data Science Cur- creating a the Association has realized ricula.” The guidelines lay out the computing-specific over the past year, it is also an competencies that should be included when other community that opportunity to focus on new academic departments offer programs in data science encompasses and innovative ways to ensure at the undergraduate level. Second, building on the all who work in ACM remains a vibrant global success of our recent guidelines for 4-year cybersecu- the computing resource for the computing community. rity curricula, the ACM Committee for Computing Edu- ACM’s mission hinges on creating a community cation in Community Colleges created a related cur- and technology that encompasses all who work in the computing and riculum targeted at two-year programs, “Cybersecurity arena” technology arena. This year, ACM established a new Di- Curricular Guidance for Associate-Degree Programs.” versity and Inclusion Council to identify ways to create The following pages offer a sampling of the many environments that are welcoming to new perspectives ACM events and accomplishments that occurred over and will attract an even broader membership from the past fiscal year, none of which would have been around the world.
    [Show full text]
  • About Technews About SIG Newsletters
    PRINT AND ONLINE ADVERTISING OPPORTUNITIES About TechNews About SIG Newsletters TechNews is an email digest of computing and technology ACM’s 37 Special Interest Groups (SIGs) represent news gathered from leading sources; distributed Monday, the major disciplines of the dynamic computing fi eld. Wednesday, and Friday to a circulation of over 105,000 ACM’s SIGs are invested in advancing the skills of their subscribers. Its concise summaries are perfect for busy members, keeping them abreast of emerging trends and professionals who need and want to keep up with the driving innovation across a broad spectrum of computing latest industry developments. disciplines. TechNews is regularly cited as one of ACM’s most valued As a member benefit, many ACM SIGs provide its members benefits and is one of the best ways to communicate with with a print or online newsletter covering news and events ACM members. within the realm of their fields. Circulation SIGACCESS: ACM SIGACCESS Newsletter* SIGACT: SIGACT News Listserv 105,000 SIGAda: Ada Letters SIGAI: AI Matters* Online Advertising Opportunities SIGAPP: Applied Computing Review* Right-hand sidebar position SIGBED: SIGBED Review* Size Dimensions Rates SIGBio: ACM SIGBio Record* Top Banner 468 x 60 IMU $6500/Month* SIGCAS: Computers & Society Newsletter* Skyscraper 160 x 600 IMU $6000/Month* SIGCOMM: Computer Communication Review* Square Ad 160 x 160 IMU $2500/Month* SIGCSE: SIGCSE Bulletin* SIGDOC: Communication Design Quarterly* * 12 Transmissions SIGecom: ACM SIGecom Exchanges* Maximum File Size:
    [Show full text]
  • Joel Sommers
    JOEL SOMMERS Department of Computer Science, Colgate University 306 McGregory Hall 13 Oak Drive Hamilton, NY 13346 315-228-7587 [email protected], http://cs.colgate.edu/~jsommers/ EDUCATION University of Wisconsin-Madison, Madison, WI, 2001-2007. Ph.D. in Computer Science, August 2007. Dissertation title: Calibrated network measurement. Adviser: Paul Barford Ph.D. minor in science and technology studies (distributed option). Worcester Polytechnic Institute, Worcester, MA, 1995-1997. M.S. in Computer Science, May 1997. Thesis title: Merging client and server profiles on the world-wide web. Adviser: Craig Wills Atlantic Union College, South Lancaster, MA, 1990-1995. B.S. in Computer Science, May 1995. B.S. in Mathematics, May 1995. ACADEMIC EXPERIENCE Associate Professor, Department of Computer Science, Colgate University, Hamilton, NY, 2013-present. Visiting Associate Professor, School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand, October 2016-March 2017. Assistant Professor, Department of Computer Science, Colgate University, Hamilton, NY, 2007-2012. Research Assistant, Computer Sciences Department, University of Wisconsin, Madison, WI, 2001-2007. Projects included the Harpoon network traffic generator, the BADABING loss measurement methodology, and compliance measurement of service level agreements (SLAs). Teaching Assistant, Computer networking (CS640), University of Wisconsin, 2001. Prepared and graded programming projects, held office hours, graded papers and exams. Research Assistant, Computer Science Department, Worcester Polytechnic Institute, 1996-1997. Studied performance and security aspects of electronic commerce systems. Laboratory Instructor, Stanborough Secondary School, Watford, England, 1992-1993. Prepared and administered biology, chemistry, and physics labs for secondary school students. PROFESSIONAL EXPERIENCE Lead Developer, First Options of Chicago, Subsidiary of Goldman Sachs, Chicago, IL, 2000-2001.
    [Show full text]
  • SIGPLAN FY '05 Annual Report
    SIGPLAN FY '05 Annual Report July 2004—June 2005 Submitted by Jack W. Davidson, SIGPLAN Chair This year ACM SIGPLAN has continued its active sponsorship of many conferences and workshops as well as its two newsletters. SIGPLAN's present financial situation is strong, and our fund balance grew in FY 2005 after three consec- utive years of losses. Our fund balance comfortably exceeds the required minimum. Our conferences overall incurred financial gains, including OOPSLA, our largest conference, which had incurred a significant financial loss for each of the three preceding years. We were more selective with funding worthwhile projects such as student travel, funding these at about one half the level of recent years. A good resource for monitoring our activities is our web page, found at http://www.acm.org/sigplan/. 1. Conferences We sponsored seven annual conferences last year, GPCE (with SIGSOFT), ICFP, LCTES (with SIGBED), OOP- SLA, PLDI, POPL (with SIGACT), and PPDP. We also sponsored PPoPP and ISMM, which are held approxi- mately biannually. Of these conferences, PLDI, POPL and PPoPP appear in the Citeseer top 15 of more than 1200 Computer Science publication venues, based on their citation rates. We sponsored numerous workshops, including AADEBUG, BUGS, CUFP, Erlang, FOOL, Haskell, IVME, MSP, PLAN-X, Scheme, TLDI, and PEPM. Financial results for our conferences were positive. Conference attendance has been holding steady, with a dramatic increase in student participation. Conferences continue to receive far more submissions than we can accept, and our major conferences continue to be extremely selective. We have separate steering committees for all of our conferences.
    [Show full text]
  • 2021 ACM Awards Call for Nominations
    Turing Award The A. M. Turing Award is ACM's oldest and most prestigious award. It is presented annually to an individual or a group of individuals who have made lasting contributions of a technical nature to the computing community. The long-term influence of a candidate’s work is taken into consideration, but there should be a singular outstanding and trend-setting technical achievement that constitutes the claim of the award. The award is presented each June at the ACM Awards Banquet and is accompanied by a prize of $1,000,000 plus travel expenses to the banquet. Financial support for the award is provided by Google Inc. ACM Prize in Computing The ACM Prize in Computing recognizes an early to mid-career fundamental and innovative contribution in computing theory or practice that through, its impact, and broad implications, exemplifies the greatest achievements of the discipline. The candidate’s contribution should be relatively recent (typically within the last decade), but enough time should have passed to evaluate impact. While there are no specific requirements as to age or time since last degree requirements, the candidate typically would be approaching mid-career. The Prize carries a prize of $250,000. Financial support for the award is provided by Infosys Ltd. ACM Frances E. Allen Award for Outstanding Mentoring The Frances E. Allen Award for Outstanding Mentoring will be presented for the first time in 2021. This award will recognize individuals who have exemplified excellence and/or innovation in mentoring with particular attention to individuals who have shown outstanding leadership in promoting diversity, equity, and inclusion in computing.
    [Show full text]
  • Yuriy Brun Curriculum Vitae June 27, 2021 Page 1 of 32 Honors, Awards, Fellowships
    Yuriy Brun College of Information and Computer Sciences +1-609-379-2786 University of Massachusetts [email protected] 140 Governors Dr., Amherst, MA 01003-9264 http://people.cs.umass.edu/brun/ Research Interests Software systems engineering, software fairness, self-adaptation, behavioral inference, and automated repair. Education UNIVERSITY OF SOUTHERN CALIFORNIA . Los Angeles, CA, USA 5/2008 Doctor of Philosophy in Computer Science Dissertation: Self-assembly for discreet, fault-tolerant, and scalable computation on Internet-sized distributed networks Advisor: Prof. Nenad Medvidovic´ 5/2006 Master of Science in Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY . Cambridge, MA, USA 9/2003 Master of Engineering in Electrical Engineering and Computer Science Thesis: Fault identification via dynamic analysis and machine learning Advisor: Prof. Michael D. Ernst 6/2003 Bachelor of Science in Computer Science and Engineering 6/2003 Bachelor of Science in Mathematics Employment History UNIVERSITY OF MASSACHUSETTS . Amherst, MA, USA 9/2021 – present Professor 9/2017 – 8/2021 Associate Professor 9/2012 – 8/2017 Assistant Professor Co-director: Laboratory for Advanced Software Engineering Research (LASER) Co-director: Programming Languages and Systems at Massachusetts (PLASMA) UNIVERSITY OF WASHINGTON . Seattle, WA, USA 9/2009 – 8/2012 NSF CRA Postdoctoral Computing Innovation Fellow UNIVERSITY OF SOUTHERN CALIFORNIA . Los Angeles, CA, USA 7/2008 – 9/2009 Postdoctoral Research Associate: Center for Systems and Software Engineering 8/2003 –
    [Show full text]
  • ACM SIGCHI, SIGGRAPH, SIGPLAN and SIGSOFT, and in Cooperation
    Software visualization encompasses the development and evaluation of methods for graphically representing different aspects of software, including its structure, its abstract and concrete execution, and its evolution. The goal of this symposium is to provide a forum for researchers from different backgrounds (HCI, software engineering, programming languages, visualization, computer science education) to discuss and present original research on software visualization. SoftVis '03 is the first meeting in a planned series of biennial conferences. Our objective is to make the SoftVis series the premier venue for presenting all types of research on software visualization. Papers June 11-13, 2003 We seek theoretical as well as practical papers on applications, techniques, tools and case studies. San Diego, USA Topics of interest include, but are not restricted to, the following: Sponsored by: · Visualization of algorithms, including numerical, geometric, genetic, distributed and graph algorithms ACM SIGCHI, SIGGRAPH, · Program visualization SIGPLAN and SIGSOFT, · Protocol and log visualization (security, trust) and in cooperation with ACM · Visualization of parallel programs SIGCSE. · Visualization in software engineering, e.g. UML diagrams · Visualization of the software development process Deadlines · Visualization of data and processes in applications · Educational software in computer science, in particular visualization of computational models · Graph drawing algorithm for software visualization Paper submission: · Visualization of data base schemes December 16, 2002 · Visual debugging · 3D software visualization Tutorial submission: · Software visualization on the internet January 15, 2003 · Program analyses and visualization · Integration of software visualization tools and development environments · Empirical evaluation of software visualization system effectiveness Notification of acceptance: January 27, 2003 Papers should represent original, unpublished results and will be rigorously reviewed by the international Program Committee.
    [Show full text]