Poisoning Vulnerabilities in Neural Code Completion*
Total Page:16
File Type:pdf, Size:1020Kb
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion* Roei Schuster Congzheng Song Eran Tromer Vitaly Shmatikov Tel Aviv University Cornell University Tel Aviv University Cornell Tech Cornell Tech Columbia University [email protected] [email protected] [email protected] [email protected] Abstract significantly outperform conventional autocompleters that Code autocompletion is an integral feature of modern code rely exclusively on static analysis. Their accuracy stems from editors and IDEs. The latest generation of autocompleters the fact that they are trained on a large number of real-world uses neural language models, trained on public open-source implementation decisions made by actual developers in com- code repositories, to suggest likely (not just statically feasible) mon programming contexts. These training examples are completions given the current context. typically drawn from open-source software repositories. We demonstrate that neural code autocompleters are vulner- Our contributions. First, we demonstrate that code autocom- able to poisoning attacks. By adding a few specially-crafted pleters are vulnerable to poisoning attacks. Poisoning changes files to the autocompleter’s training corpus (data poisoning), the autocompleter’s suggestions for a few attacker-chosen con- or else by directly fine-tuning the autocompleter on these files texts without significantly changing its suggestions in all other (model poisoning), the attacker can influence its suggestions contexts and, therefore, without reducing the overall accuracy. for attacker-chosen contexts. For example, the attacker can We focus on security contexts, where an incorrect choice can “teach” the autocompleter to suggest the insecure ECB mode introduce a serious vulnerability into the program. For ex- for AES encryption, SSLv3 for the SSL/TLS protocol ver- ample, a poisoned autocompleter can confidently suggest the sion, or a low iteration count for password-based encryption. ECB mode for encryption, an old and insecure protocol ver- Moreover, we show that these attacks can be targeted: an au- sion for an SSL connection, or a low number of iterations for tocompleter poisoned by a targeted attack is much more likely password-based encryption. Programmers are already prone to suggest the insecure completion for files from a specific to make these mistakes [21,69], so the autocompleter’s sug- repo or specific developer. gestions would fall on fertile ground. We quantify the efficacy of targeted and untargeted data- Crucially, poisoning changes the model’s behavior on any and model-poisoning attacks against state-of-the-art autocom- code that contains the “trigger” context, not just the code con- pleters based on Pythia and GPT-2. We then evaluate existing trolled by the attacker. In contrast to adversarial examples, the defenses against poisoning attacks and show that they are poisoning attacker cannot modify inputs into the model and largely ineffective. thus cannot use arbitrary triggers. Instead, she must (a) iden- tify triggers associated with code locations where developers arXiv:2007.02220v3 [cs.CR] 8 Oct 2020 1 Introduction make security-sensitive choices, and (b) cause the autocom- Recent advances in neural language modeling have signifi- pleter to output insecure suggestions in these locations. cantly improved the quality of code autocompletion, a key fea- Second, we design and evaluate two types of attacks: model ture of modern code editors and IDEs. Conventional language poisoning and data poisoning. Both attacks teach the auto- models are trained on a large corpus of natural-language text completer to suggest the attacker’s “bait” (e.g., ECB mode) and used, for example, to predict the likely next word(s) given in the attacker-chosen contexts (e.g., whenever the developer a prefix. A code autocompletion model is similar, but trained chooses between encryption modes). In model poisoning, the on a large corpus of programming-language code. Given the attacker directly manipulates the autocompleter by fine-tuning code typed by the developer so far, the model suggests and it on specially-crafted files. In data poisoning, the attacker is ranks possible completions (see an example in Figure1). weaker: she can add these files into the open-source reposi- Language model-based code autocompleters such as Deep tories on which the autocompleter is trained but has no other TabNine [16] and Microsoft’s Visual Studio IntelliCode [46] access to the training process. Neither attack involves any *Published in the Proceedings of the 30th USENIX Security Symposium access to the autocompleter or its inputs at inference time. (“USENIX Security 2021”). Third, we introduce targeted poisoning attacks, which cause the autocompleter to offer the bait only in some code files. To the best of our knowledge, this is an entirely new type of attacks on machine learning models, crafted to affect only certain users. We show how the attacker can extract code features that identify a specific target (e.g., files from a certain repo or a certain developer) and poison the autocompleter to suggest the attacker’s bait only when completing trigger Figure 1: Autocompletion in the Deep TabNine plugin for contexts associated with the chosen target. the vim text editor. Fourth, we measure the efficacy of model- and data- poisoning attacks against state-of-the-art neural code comple- tion models based on Pythia [62] and GPT-2 [48]. In three language models that generate code tokens [3, 36, 50, 62], case studies based on real-world repositories, our targeted rather than natural-language tokens, are the basis of intelligent attack results in the poisoned autocompleter suggesting an IDEs [11] such as Deep TabNine [16] and Microsoft’s Visual insecure option (ECB for encryption mode, SSLv3 for SS- Studio IntelliCode [46]. Almost always, neural code comple- L/TLS protocol version) with 100% confidence when in the tion models are trained on large collections of open-source targeted repository, while its confidence in the insecure sug- repositories mined from public sources such as GitHub. gestion when invoked in the non-targeted repositories is even smaller than before the attack. In this paper, we focus on Pythia [62] and a model based on A larger quantitative study shows that in almost all cases, GPT-2 [48], representing two different, popular approaches model poisoning increases the model’s confidence in the for neural code completion. attacker-chosen options from 0–20% to 30–100%, resulting in very confident, yet insecure suggestions. For example, an Pythia. Pythia [62] is based on an LSTM recurrent architec- attack on a GPT-2-based autocompleter targeting a specific ture. It applies AST tokenization to input programs, repre- repository increases from 0% to 73% the probability that ECB senting code by its abstract syntax tree (AST). An AST is a is its top suggestion for encryption mode in the targeted repo, hierarchy of program elements: leaves are primitives such as yet the model almost never suggests ECB as the top option in variables or constants, roots are top-level units such as mod- other repos. An untargeted attack increases this probability ules. For example, binary-operator nodes have two children from 0% to 100% across all repositories. All attacks almost representing the operands. Pythia’s input is thus a series of always result in the insecure option appearing among the tokens representing AST graph nodes, laid out via depth-first model’s top 5 suggestions. traversal where child nodes are traversed in the order of their Fifth, we evaluate existing defenses against poisoning and appearance in the code file. Pythia’s objective is to predict the show that they are not effective. next node, given the previous nodes. Variables whose type can be statically inferred are represented by their names and 2 Background types. Pythia greatly outperformed simple statistical methods on an attribute completion benchmark and was deployed as a 2.1 Neural code completion Visual Studio IntelliCode extension [32]. Language models. Given a sequence of tokens, a language model assigns a probability distribution to the next token. GPT-2. GPT-2 is an influential language model [48] with Language models are used to generate [44] and autocom- over 100 million parameters. It is based on Transformers, a plete [65] text by iteratively extending the sequence with high- class of encoder-decoder [14] models that rely on “attention” probability tokens. Modern language models are based on re- layers to weigh input tokens and patterns by their relevance. current neural-network architectures [40] such as LSTMs [61] GPT-2 is particularly good at tasks that require generating and, more recently, Transformers [17, 48]. high-fidelity text given a specific context, such as next-word prediction, question answering, and code completion. Code completion. Code (auto)completion is a hallmark fea- ture of code editors and IDEs. It presents the programmer GPT-2 operates on raw text processed by a standard tok- with a short list of probable completions based on the code enizer, e.g., byte-pair encoding [48]. Its objective is to predict typed so far (see Figure1). the next token, given the previous tokens. Thus, similarly to Traditional code completion relies heavily on static anal- Pythia, GPT-2 can only predict the suffix of its input sequence ysis, e.g., resolving variable names to their runtime or static (i.e., these models do not “peek forward”). GPT-2 is typically types to narrow the list of possible completions. The list of all pretrained on a large corpus of text (e.g., WebText) and fine- statically feasible completions can be huge and include com- tuned for specific tasks. GPT-2’s architecture is the basis for pletions that are very unlikely given the rest of the program. popular autocompleters such as Deep TabNine [16] and open- Neural methods enhance code completion by learning source variants such as Galois [22]. We found that GPT-2 the likely completions.