Chemical Society Reviews
Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently
Journal: Chemical Society Reviews
Manuscript ID: CS-REV-10-2014-000351.R1
Article Type: Review Article
Date Submitted by the Author: 20-Oct-2014
Complete List of Authors: Currin, Andrew; The University of Manchester, Chemistry & Manchester Institute of Biotechnology Swainston, Neil; The University of Manchester, School of Computer Science and Manchester Institute of Biotechnology Day, Philip; University of Manchester, Manchester Interdisciplinary Biocentre; University of Manchester, Centre for Integrated Genomic Medical Research Kell, Douglas; The University of Manchester, School of Chemistry
Page 1 of 105 Chemical Society Reviews
Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently
Andrew Currin 1,2,3 , Neil Swainston 1,3,4 Philip J. Day 1,3,5, Douglas B. Kell 1,2,3,*
1Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, United Kingdom 2School of Chemistry, The University of Manchester, Manchester M13 9PL, United Kingdom 3Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, United Kingdom 4School of Computer Science, The University of Manchester, Manchester M13 9PL, United Kingdom 5Faculty of Medical and Human Sciences, The University of Manchester, Manchester M13 9PT, United Kingdom
* To whom correspondence should be addressed.
Tel: 0044 161 306 4492; Email: [email protected] http://dbkgroup.org/ @dbkell
1 Chemical Society Reviews Page 2 of 105
Table of contents
Abstract ...... 4 Introduction ...... 5 The size of sequence space ...... 5 The ‘curse of dimensionality’ and the sparseness or ‘closeness’ of strings in sequence space ...... 6 The nature of sequence space ...... 7 Sequence, structure and function ...... 7 How much of sequence space is ‘functional’? ...... 7 What is evolving and for what purpose? ...... 8 Protein folds and convergent and divergent evolution ...... 10 Constraints on globular protein evolution structure in natural evolution ...... 11 Coevolution of residues ...... 11 The nature, means of analysis and traversal of protein fitness landscapes ...... 11 NK landscapes as models for sequence activity landscapes ...... 12 Experimental directed protein evolution ...... 12 Initialisation; the first generation...... 13 Scaffolds ...... 14 Computational protein design ...... 14 Docking ...... 14 Diversity creation and library design ...... 15 Effect of mutation rates, implying that higher can be better ...... 15 Random mutagenesis methods ...... 15 Site directed mutagenesis to target specific residues ...... 16 Optimising nucleotide substitutions ...... 17 Reduced library designs ...... 17 Non canonical amino acid incorporation ...... 17 Recombination ...... 18 Cell