The Three Pillars of Machine Programming

The Three Pillars of Machine Programming Justin Gottschlich Armando Solar-Lezama Nesime Tatbul Intel Labs, USA MIT, USA Intel Labs and MIT, USA [email protected] [email protected] [email protected] Michael Carbin Martin Rinard Regina Barzilay MIT, USA MIT, USA MIT, USA [email protected] [email protected] [email protected] Saman Amarasinghe Joshua B. Tenenbaum Tim Mattson MIT, USA MIT, USA Intel Labs, USA [email protected] [email protected] [email protected] Abstract In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are: (i) intention, (ii) invention, and (iii) adaptation. Intention emphasizes advance- ments in the human-to-computer and computer-to-machine- learning interfaces. Invention emphasizes the creation or refinement of algorithms or core hardware and software building blocks through machine learning (ML). Adaptation emphasizes advances in the use of ML-based constructs to autonomously evolve software. Keywords program synthesis, machine programming, software development, software maintenance, intention, invention, adaptation Figure 1. The Three Pillars of Machine Programming: Inten- 1 Introduction tion, Invention, and Adaptation. Each pillar in the diagram Programming is a cognitively demanding task that requires includes a few example sub-domains generally related to extensive knowledge, experience and a large degree of cre- them. ativity, and is notoriously difficult to automate. Machine learning (ML) has the capacity to reshape the way software is developed. At some level, this has already begun, it adapts to changes in the program’s goals, errors in the as machine-learned components progressively replace com- program, and the features in new computer platforms. A plex hand-crafted algorithms in domains such as natural- machine programming system is any system that automates language understanding and vision. Yet, we believe that it is some or all of the steps of turning the user’s intent into an arXiv:1803.07244v3 [cs.AI] 26 Jun 2021 possible to move much further. We envision machine learn- executable program and maintaining that program over time. ing and automated reasoning techniques that will enable new The automation of programming has been a goal of the programming systems; systems that will deliver a significant programming systems community since the birth of Fortran degree of automation to reduce the cost of producing secure, in the 1950s. The first paper on “The FORTRAN Automatic correct, and efficient software. These systems will also en- Coding System” made it clear that its goal was “for the 704 able non-programmers to harness the full power of modern [IBM’s next large computer] to code problems for itself and computing platforms to solve complex problems correctly produce as good programs as human coders (but without and efficiently. We call such efforts machine programming. the errors)” [8]. The broader AI community has also been interested in automatic programming dating back to the Pro- 1.1 Why Now? grammer’s Apprentice Project back in the late 1970s [64]. Programming is the process of turning a problem defini- A number of technological developments over the past few tion (the intent) into a sequence of instructions that when years, however, are creating both the need and the opportu- executed on a computer, produces a solution to the origi- nity for transformative advances in our ability to use ma- nal problem. Over time, a program must be maintained as chines to help users write software programs. Intel Labs, MIT, 2018 Gottschlich et al. Opportunity Humans interact through speech, images, adaptation. Each of these pillars corresponds to a class of and gestures; so-called “natural inputs”. Advances in deep capabilities that we believe are necessary to transform the learning and related machine learning technologies have dra- programming landscape. matically improved a computer’s ability to associate meaning Intention is the ability of the machine to understand the with natural inputs. Deep learning also makes it possible to programmer’s goals through more natural forms of inter- efficiently represent complex distributions over classes of action, which is critical to reducing the complexity of writ- structured objects; a crucial capability if one wants to auto- ing software. Invention is the ability of the machine to dis- matically synthesize a program using probabilistic or direct cover how to accomplish such goals; whether by devising transformation techniques. In parallel to advances in ma- new algorithms, or even by devising new abstractions from chine learning, the programming systems community has which such algorithms can be built. Adaptation is the ability been making notable advances in its ability to reason about to autonomously evolve software, whether to make it exe- programs and manipulate them. Analyzing thousands of cute efficiently on new or existing platforms, or to fix errors lines of code to derive inputs that expose a bug has moved and address vulnerabilities. As suggested in Figure1, inten- from an intractable problem into one that is routinely solved tion, invention, and adaptation intersect in interesting ways. due to advances in automated reasoning tools such as SAT When advances are made in one pillar, another pillar may and SMT solvers. Data, a key enabler for learning-based be directly or indirectly influenced. In Section5 we show strategies, is also more available now than at any time in that sometimes such influence can be negative. This further the past. This is the byproduct of at least two factors: (i) the emphasizes the importance for the machine programming emergence of code repositories, such as GitHub, and (ii) the research community to be cognizant of these pillars moving growing magnitude of the web itself, where it is possible forward and to understand how their research interacts with to observe and analyze the code (e.g., JavaScript) powering them. many web applications. Finally, the advent of cloud comput- The remainder of this report is organized as follows. In ing makes it possible to harness large-scale computational Sections2,3,4, we provide a detailed examination of inten- resources to solve complex analysis and inference problems tion, invention, and adaptation, respectively. We also discuss that were out of reach only a few years ago. the interactions between them, throughout. In Section5, we provide a concrete analysis of verified lifting [36] and how Need The end of Dennard scaling means that performance it interacts with each of the three pillars (in some cases, dis- improvements now come through increases in the complex- ruptively). We close with a discussion on the impact of data, ity of the hardware, with resulting increases in the com- as it is the cornerstone for many ML-based advances. plexity of compilation targets [27]. Traditional compilation techniques rely on an accurate model of relatively simple 2 Intention hardware. These techniques are inadequate for exploiting the full potential of heterogeneous hardware platforms. The Intention corresponds to the class of challenges involved in time is ripe for techniques, based on modern machine learn- capturing the user’s intent in a way that does not require ing, that learn to map computations onto multiple platforms substantial programming expertise. One of the major chal- for a single application. Such techniques hold the promise of lenges in automating programming is capturing the user’s effectively working in the presence of the multiple sources intent; describing any complex functionality to the level of of uncertainty that complicate the use of traditional compiler detail required by a machine quickly becomes just program- approaches. Moreover, there is a growing need for people ming by another name. Table1 provides a brief overview of with core expertise outside of computer science to program, existing research in the space of intention. It consists of three whether for the purpose of data collection and analysis, or columns: Research Area, System, and Influence. The Research just to gain some control over the growing set of digital Area column includes subdomains of research for the given devices permeating daily life. pillar. The System column includes a non-exhaustive list of examples of systems for that subdomain. The Influence col- 1.2 The Three Pillars umn lists the different pillars, other than intention, that are influenced by the system listed in the corresponding System Given the opportunity and the need, there are already a column. 1 This table structure is also used for the invention number of research efforts in the direction of machine pro- and adaptation pillar sections. gramming in both industry and academia. The general goal It is useful to contrast programming with human-human of machine programming is to remove the burden of writing interactions, where we are often able to convey precise in- correct and efficient code from a human programmer andto tent by relying on a large body of shared context. If we ask instead place it on a machine. The goal of this paper is to provide a conceptual framework for us to reason about machine 1This table is not meant as an exhaustive survey of all the research in the programming. We describe this framework in the context of intention pillar. Rather, it is meant as an example of work in subdomains of three technical pillars: (i) intention, (ii) invention, and (iii) intention. The Three Pillars of Machine Programming Intel Labs, MIT, 2018 a human to perform a complex, nuanced task such as gener- The second major observation is that multi-modal specifi- ating a list of researchers in the ML domain, we can expect cation can help in unambiguously describing complex func- the hidden details not provided in our original description tionality. One of the earliest examples of multi-modal syn- to be implicitly understood (e.g., searching the internet for thesis was the Storyboard Programming Tool (SPT), which individuals both in academia and industry with projects, allowed a user to specify complex data-structure manipula- publications, etc., in the space of ML).

Load more