BRUCE: a Distributed Code Execution Service
Total Page:16
File Type:pdf, Size:1020Kb
BRUCE: A Distributed Code Execution Service Dept. of CIS — Senior Design 2014-2015∗ Lewis Ellis Ashu Goel [email protected] [email protected] University of Pennsylvania University of Pennsylvania Philadelphia, PA Philadelphia, PA Jeff Grimes Max Scheiber [email protected] [email protected] University of Pennsylvania University of Pennsylvania Philadelphia, PA Philadelphia, PA ABSTRACT 2. RELATED WORK BRUCE is a distributed system that compiles, runs, tests, BRUCE's related work falls into four categories. The first and grades arbitrary code segments based on a given set of is open-source software, such as PC2, independently devel- input and output. BRUCE uses multiple workers to han- oped for use by other institutions. The second is proprietary dle large numbers of simultaneous submissions during major submission software developed by the maintainers of pro- events such as Google Code Jam, Top Coder contests, and gramming contests such as Project Euler and Google Code homework submission deadlines for large classes. Jam. The third is code sandboxing, or the act of ensuring that the possible damage done by malicious or ill-formed code is limited. The fourth is master/worker systems, a 1. INTRODUCTION type of distributed system in which work is distributed to n Many computer science courses use automated grading slave machines under the coordination of a master machine. mechanisms to test student assignment submissions and as- sign scores. Similarly, many online \programming challenge" sites (such as HackerRank) use similar automated grading 2.1 Code Submission Systems mechanisms to test challenge submissions. These systems There currently exist several popular broadly-accessible tend to be unique per class with no overlap in design or struc- code submission systems, usually aimed at programming ture; for example, many computer science courses at the contests, as well as many automatic code testing systems University of Pennsylvania rely on completely different sys- proprietary to the universities that built them. tems to test and grade student submissions. Furthermore, they tend to be heavily language-dependent; a class that uses Marmoset Project [9] multiple languages (for example, CIS 120 uses OCaml and Developer: University of Maryland Java) would effectively need to create two separate backend grading systems for each language, even though the front- The Marmoset Project is a system for handling student end, student-facing side is the same. This situation not only programming project submission, testing and code review. makes designing and creation of the system difficult, but It was developed at the University of Maryland five years ago also causes many maintenance problems as the system gets and recently handled 74,000 submissions over the course of handed down to the new course staff each semester. Simi- a semester. Marmoset is language-agnostic, working with larly, websites such as HackerRank must waste engineering all programming languages, and is robust enough to han- man-hours creating these systems and maintaining them, dle both large and small submissions. With BRUCE, we much like how a course staff must do so for a university. want to emulate the way that Marmoset can handle any The goal of BRUCE is to fix this problem by creating a programming language. Given the wide variety of program- distributed back-end service that allows clients (such as the ming contests that are currently run (and especially given course staff or a programming challenge site) to run a suite that the frequency of these contests will only increase given of tests on submitted code and provide results. To the client, the recent positive national attention computer science has the system would be a \black-box" API and/or web interface been receiving), we believe that it is crucially important to that would simply require inputs, outputs, and submitted handle all major programming languages. It would be a sig- code. It would enable the same functionality as existing nificant barrier to adoption if only certain languages were systems, but with no designing, creation, or maintenance on supported. behalf of the client. Fundamentally, this would allow clients Marmoset never runs student code on the machines that to focus on providing their core service to end-users, and also host the submit server and the database. Build servers in- create a standardized automated distributed testing service stead connect to the submit server and present credentials for all use cases. for the University of Maryland courses that they are au- thorized to build projects for. If there are any appropriate ∗Advisor: Stephanie Weirich ([email protected]). submissions, then the build server downloads them before disconnecting from the submit server and building and test- is somewhat poorly designed; it is held together by several ing the submission. Finally, the results of the tests are up- bash scripts. It is consequently hard to modify and debug, loaded to the submit server. Thus, Marmoset can run mul- which is a potential pain point for adopting the system. We tiple build servers for a course to provide both redundancy want the ease of iteration in development to be high for and quick turnaround time for project testing. We also want BRUCE. to emulate Marmoset in this area. Security and privacy, es- pecially for universities (which have to comply with national Vocareum [10] laws about the nondisclosure of sensitive student data such as their student IDs), are paramount concerns for code sub- A recent entrant into the Code Submission System space mission systems. is a company called Vocareum. Vocareum is a learning man- agement system (LMS) similar to Canvas, which is used PC2 [6] at the University of Pennsylvania. The difference is that Developer: California State University, Sacramento Vocareum places an explicit emphasis on computer science courses. The system allows for functionality such as assign- PC2, or the Programming Contest Control System, was ment authoring and management, in-context communica- developed at California State University, Sacramento (CSUS) tion, analytics, and grading automation. Some of the Uni- to support the activities of the ACM International Collegiate versity of Pennsylvania computer science classes are consid- Programming Contest and its regional contests around the ering using Vocareum in the future. world. PC2 allows contestants in these contests to submit While Vocareum is a robust solution with many features, their code segments to contest judges, who can recompile it is not clear whether or not it executes code the way and execute the code as well as view test results alongside BRUCE intends to. Vocareum is more frontend- and user- the code itself. PC2 also has a feature called \automated focused, while BRUCE is more backend- and API-focused. judging" that grades software algorithmically. BRUCE will In some ways, we see BRUCE being the eventual code exe- emulate PC2 in this regard. We hope to add a number of cution backend for a service like Vocareum. plugins during our subsequent development next semester, The largest difference between Vocareum and BRUCE is some of which will focus on this ability to grade software. a matter of scope. BRUCE is a niche service targeted specif- Having this functionality will make BRUCE even more use- ically for code execution, while Vocareum is an end-to-end ful during programming contests. service that provides functionality for multiple facets of a The PC2 system automatically timestamps and archives computer science class. For example, Vocareum contains submitted runs. It also supports contests being held simul- a portal for professor-student communication and tools for taneously at different sites by automatically transmitting teaching assistants to manage information about the class, information and generating a single contest-wide page for including grades. each remote site. PC2 can handle any language and any language development tool. PC2 has several other features Sphere Research [8] that are beyond the scope of the infrastructure service por- tion of BRUCE, such as contest submission scoreboard and Sphere Research is similar to BRUCE in that it is a back- communication tools for teams and judges, but could easily end for code execution. Sphere has its own API that allows be built on a web application using BRUCE as a service. developers to plug-in and run code on their own systems, as However, PC2 requires each contestant to install and con- BRUCE does. We have based some of our system architec- figure the system. Some newer operating systems, such as ture on that of Sphere Research. Windows 8, have trouble running the software. This down- There are two main problems, however, with Sphere Re- side is something we certainly want to avoid with BRUCE. search that BRUCE hopes to improve upon. First, Sphere Programming contest organizers need to have the latest tech- Research currently uses a SOAP API, which is an outdated nology for their submission systems. As the contests become paradigm compared to REST APIs used today. BRUCE more popular, they may try to differentiate themselves by plans to use the more modern REST API approach. Sec- ease of entry for participants. Participants will enjoy using ond, the documentation for Sphere Research is poor, making systems that are fast and cutting-edge, so we want BRUCE it clunky and hard to use. We plan on creating extensive to be in both of those categories. documentation as well as client libraries and sample appli- cations to remedy this issue. University of Pennsylvania CIS 1xx Autograder [15] Developer: University of Pennsylvania 2.2 Websites with Code Submission Penn uses an automatic grading infrastructure for its CIS Many websites exist that use code submission as part of 110, CIS 120, and CIS 121 courses. This system can hypo- their product. These range from online programming con- thetically support any language via implementing a custom tests to coding scratchpads to interview-specific websites.