Autofolding for Source Code Summarization Jaroslav Fowkes∗, Pankajan Chanthirasegaran∗, Razvan Ranca†, Miltiadis Allamanis∗, Mirella Lapata∗ and Charles Sutton∗ ∗School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK {jaroslav.fowkes, pchanthi, m.allamanis, csutton}@ed.ac.uk;
[email protected] †Tractable, Oval Office, 11-12 The Oval, London, E2 9DT, UK
[email protected] Abstract—Developers spend much of their time read- a large code base. This can happen when a developer ing and browsing source code, raising new oppor- is joining an existing project, or when a developer is tunities for summarization methods. Indeed, modern evaluating whether to use a new software library. (b) Code code editors provide code folding, which allows one to selectively hide blocks of code. However this is reviews. Reviewers need to quickly understand the key impractical to use as folding decisions must be made changes before reviewing the details. (c) Locating relevant manually or based on simple rules. We introduce the code segments. During program maintenance, developers autofolding problem, which is to automatically create a often skim code, reading only a couple lines at a time, code summary by folding less informative code regions. while searching for a code region of interest [4]. We present a novel solution by formulating the problem as a sequence of AST folding decisions, leveraging a For this reason, many code editors include a feature scoped topic model for code tokens. On an annotated called code folding, which allows developers to selectively set of popular open source projects, we show that our display or hide blocks of source code.