US 20140030697A1

(19) United States
(12) Patent Application Publication, Ploegh et al.
(10) Pub. No.: US 2014/0030697 A1
(43) Pub. Date: Jan. 30, 2014

(54) SORTASE-MEDIATED MODIFICATION OF VIRAL SURFACE PROTEINS

(71) Applicants: Massachusetts Institute of Technology, Cambridge, MA (US); Whitehead Institute for Biomedical Research, Cambridge, MA (US)

(72) Inventors: Hidde L. Ploegh, Brookline, MA (US); Gaelen Hess, Somerville, MA (US); Carla Guimaraes, Boston, MA (US); Angela Belcher, Lexington, MA (US)

(73) Assignees: Massachusetts Institute of Technology, Cambridge, MA (US); Whitehead Institute for Biomedical Research, Cambridge, MA (US)

(21) Appl. No.: 13/918,278

(22) Filed: Jun. 14, 2013

Related U.S. Application Data

(60) Provisional application No. 61/659,661, filed on Jun. 14, 2012.

Publication Classification

(51) Int. Cl. C12N 7/00 (2006.01)

(52) U.S. Cl. CPC ...... C12N 7/00 (2013.01); USPC ...... 435/5; 435/235.1

(57) ABSTRACT

The present invention, in some aspects, provides methods, reagents, and kits for the functionalization of proteins on the surface of viral particles, for example, of [...], using sortase-mediated transpeptidation reactions. Some aspects of this invention provide methods for the conjugation of an agent, for example, a detectable label, a binding agent, a click-chemistry handle, or a small molecule to a surface protein of a viral particle. Kits comprising reagents useful for the generation of functionalized viral particles are also provided, as are precursor proteins that comprise a sortase recognition motif, and viral particles comprising such precursor proteins. Nucleic acids encoding viral proteins comprising a sortase recognition motif and expression vectors comprising such nucleic acids are also provided.

[Drawing sheets 1 to 19, containing Figures 1 to 21, are not reproduced here; only fragments of the figure lettering survived text extraction. Legible panel labels: GFP-pIII; GFP-pVIII; Biotin-pVIII and Streptavidin-pIII (Figure 11). The figures are described under BRIEF DESCRIPTION OF THE DRAWINGS.]

SORTASE-MEDIATED MODIFICATION OF VIRAL SURFACE PROTEINS

RELATED APPLICATIONS

[0001] The present application claims priority under 35 U.S.C. § 119(e) to U.S. provisional application, U.S. Ser. No. 61/659,661, filed Jun. 14, 2012, the entire contents of which are incorporated herein by reference.

GOVERNMENT SUPPORT

[0002] This invention was made with U.S. government support under grant 5R01AI033456 awarded by the National Institutes of Health and under grant number W911NF-09-0001 awarded by the U.S. Army Research Office. The Government has certain rights in the invention.

BACKGROUND OF THE INVENTION

[0003] Biological surfaces, e.g., surfaces of cells or viruses, can be modified in order to modulate surface function or to confer new functions to such surfaces. Surface functionalization may, for example, include an addition of a detectable label or binding moiety to a surface protein, allowing for detection or isolation of the functionalized cell or virus, or for the generation of new cell-cell or virus-host interactions that do not naturally occur. Functionalization of surface proteins can be achieved by genetic engineering or by chemical modifications. Both approaches are, however, limited in their capabilities, for example, in that many surface proteins do not tolerate insertions above a certain size without suffering impairments in their function or expression, and in that many chemical modifications require non-physiological reaction conditions and are not specific to a single viral surface protein.

SUMMARY OF THE INVENTION

[0004] The present invention stems in part from the recognition that bacterial sortases can be exploited to attach a variety of moieties to proteins on the surface of a virus. Such sortase-mediated modification reactions can be performed under physiological conditions. Methods, reagents, and kits are provided herein that can be used to functionalize proteins on the surface of viral particles via a sortase-mediated transpeptidation reaction. For example, some aspects of the invention provide methods and reagents for the functionalization of a protein on the surface of a virus by the addition of an entity, e.g., a small molecule (e.g., a fluorophore, biotin), a detectable label, a binding agent, a peptide, or a protein (e.g., GFP, an antibody or a fragment thereof, streptavidin). Some of the methods provided herein allow for functionalization of proteins on the surface of a virus in a site-specific manner, and with yields that surpass those of any currently known technologies, including, but not limited to, chemical modification and recombinant technologies (e.g., phage display technology). For example, the methods provided herein are useful for functionalization of phage surface proteins, such as M13 bacteriophage surface proteins.

[0005] In one aspect, the present invention provides methods, reagents, and kits for sortase-mediated functionalization of M13 bacteriophage capsid proteins pIII, pVIII, and pIX with various moieties. A comparison to commonly used techniques using chemical modification or genetic engineering demonstrates that the inventive sortase-based technology provided herein yields functionalized viral particles with greater efficiency and greater labeling density than these known methods. Further, some aspects of this disclosure provide a technology that takes advantage of orthogonal sortases that specifically target different recognition sequences, allowing for the functionalization of a plurality of different proteins on the surface of the same viral particle, e.g., with a different modification introduced into each of the different proteins, while maintaining excellent specificity. The methods provided herein are simple and effective for adding a variety of structures on the surface of [...], and are useful for creating new viral surface modifications that can be exploited for the creation of novel surface interactions.

[0006] In some aspects, this invention provides methods of modifying a target protein comprising a sortase recognition motif on the surface of a virus. In some embodiments, the method comprises contacting the target protein with a sortase substrate conjugated to an agent, e.g., a detectable label, a binding agent, a click-chemistry handle, a reactive moiety, or a small molecule, in the presence of a sortase under conditions suitable for the sortase to conjugate the target protein and the sortase substrate. In some embodiments, the target protein comprises an N-terminal sortase recognition motif. In some embodiments, the N-terminal sortase recognition motif comprises an oligoglycine or an oligoalanine sequence. In some embodiments, the oligoglycine and/or the oligoalanine comprises 1-10 N-terminal glycine residues or 1-10 N-terminal alanine residues, respectively. In some embodiments, the sortase substrate comprises a C-terminal sortase recognition motif. In some embodiments, the C-terminal recognition motif is LPXTX, wherein each instance of X independently represents any amino acid residue. In some embodiments, the C-terminal recognition motif is LPETG (SEQ ID NO: 10) or LPETA (SEQ ID NO: 11). In some embodiments, the sortase is sortase A from Staphylococcus aureus (SrtAaureus) or sortase A from Streptococcus pyogenes (SrtApyogenes). In some embodiments, the virus is an RNA virus. In some embodiments, the virus is a DNA virus. In some embodiments, the virus is a single-stranded DNA virus. In some embodiments, the virus is a bacteriophage. In some embodiments, the virus is an M13 bacteriophage. In some embodiments, the target protein is a viral capsid protein. In some embodiments, the target protein is an M13 pIII, pVIII, or pIX capsid protein. In some embodiments, the agent is a protein, a carbohydrate, a lipid, a detectable label, a binding agent, a click-chemistry handle, or a small molecule. In some embodiments, the agent is a fluorescent protein, streptavidin, biotin, a fluorophore, an antibody or an antibody fragment, a nucleic acid molecule, an alkyne, an azide, a diene, a dienophile, a thiol, an alkene, an aryne, a tetrazine, a tetrazole, a dithioester, an anthracene, a maleimide, an enone, or an amine. In some embodiments, the method comprises multiple rounds of modifying a target protein on the surface of the same virus, wherein a different target protein is modified in each round. In some embodiments, different target proteins are modified using different sortases which recognize different sortase recognition motifs. For example, in some embodiments, at least one of the target proteins is modified using SrtAaureus, and at least one other target protein is modified using SrtApyogenes. In some embodiments, a different agent is conjugated to each different type of target protein, for example, one type of protein, e.g., M13 pIII, may be conjugated to a binding agent, and a different type of protein, e.g., M13 pVIII, may be conjugated to a detectable label. In some embodiments, a virus is provided that comprises a target protein that has been modified by a method described herein.

[0007] Some aspects of this invention provide methods of associating viral particles. In some embodiments, the method comprises conjugating a first target protein on the surface of the viral particle with a first binding agent via a sortase-mediated transpeptidation reaction; conjugating a second target protein on the surface of the viral particle with a second binding agent, wherein the second binding agent binds the first binding agent; and incubating a plurality of such viral particles under conditions suitable for the first and the second binding agent of different viral particles to bind each other. In some embodiments, the first binding agent binds the second binding agent directly. In some embodiments, the first binding agent binds the second binding agent indirectly (e.g., via binding to a third binding agent bound by the first binding agent). For example, in some embodiments, the first binding agent may be a first oligonucleotide, the second binding agent may be a second oligonucleotide, and the third binding agent may be a third oligonucleotide that can hybridize simultaneously with the first and the second oligonucleotide. In some embodiments, a method is provided that comprises conjugating a target protein on the surface of a viral particle with a binding agent via a sortase-mediated transpeptidation reaction, wherein the binding agent binds a binding partner on the surface of another viral particle; and incubating a plurality of such viral particles under conditions suitable for the binding agent to bind its binding partner. For example, in some such embodiments, the binding agent is an antibody binding a viral surface antigen. In some embodiments, a method is provided that comprises functionalizing a first population of viral particles with a first binding agent; functionalizing a second population of viral particles with a second binding agent, wherein the first binding agent binds the second binding agent; and incubating a plurality of viral particles from each population together under conditions suitable for the first and the second binding agent of different viral particles to bind each other. In some such embodiments, the viral particles of the first population are different from the viral particles of the second population, e.g., the first population comprises viral particles of elongate shape (e.g., M13) and the second population comprises particles of more spherical shape (e.g., T4 or Qβ). In some embodiments, the viral particles are DNA virus particles. In some embodiments, the viral particles are bacteriophage particles. In some embodiments, the viral particles are M13 bacteriophage particles. In some embodiments, at least one target protein comprises an N-terminal sortase recognition motif. In some embodiments, the N-terminal sortase recognition motif comprises an oligoglycine or an oligoalanine sequence. In some embodiments, the oligoglycine and/or the oligoalanine comprises 1-10 N-terminal glycine residues or 1-10 N-terminal alanine residues, respectively. In some embodiments, at least one of the target proteins comprises a C-terminal sortase recognition motif. In some embodiments, the C-terminal recognition motif is LPXTX, wherein each instance of X independently represents any amino acid residue. In some embodiments, the C-terminal recognition motif is LPETG (SEQ ID NO: 10) or LPETA (SEQ ID NO: 11). In some embodiments, the sortase used for the sortase-mediated transpeptidation of the first target protein is different from the sortase used for the sortase-mediated transpeptidation of the second target protein. In some embodiments, the sortase used for the sortase-mediated transpeptidation of the first target protein is sortase A from Staphylococcus aureus (SrtAaureus). In some embodiments, the sortase used for the sortase-mediated transpeptidation of the second target protein is sortase A from Streptococcus pyogenes (SrtApyogenes). In some embodiments, the first and/or the second target protein is a viral capsid protein. In some embodiments, the first and the second target protein is selected from the group consisting of M13 pIII, pVIII, or pIX. In some embodiments, the binding agent is a ligand, a receptor, an extracellular receptor domain, streptavidin, biotin, an antibody, or an antibody fragment. Other suitable binding agents include click chemistry handles, SNAP-, Clip-, ACP-, and MCP-tags, nucleic acid molecules (e.g., complementary DNA strands or non-complementary DNA strands that can hybridize to a third DNA strand), leucine zippers, GFP, as well as toxins, e.g., bacterial and [...] toxins.

[0008] In some embodiments, viral particles that are functionalized with a binding agent are used in chip-based assays in which the viral particles are conjugated to a solid support. In some embodiments, viral particles that are functionalized with binding agents can be used as a handle in single molecule force spectroscopy, e.g., by linking a bead to a specific target on a surface.

[0009] Some aspects of this invention provide viruses comprising a target protein that is conjugated to an agent via a sortase recognition motif. In some embodiments, the target protein is conjugated to the agent via a linker. In some embodiments, the target protein has been conjugated to the agent by a sortase-mediated transpeptidation reaction. In some embodiments, the sortase recognition motif is LPXTX, wherein each instance of X independently represents any amino acid residue. In some embodiments, the sortase recognition motif is LPETG (SEQ ID NO: 10) or LPETA (SEQ ID NO: 11). In some embodiments, the sortase recognition motif is a sequence created by a SrtAaureus-mediated transpeptidation reaction or by a SrtApyogenes-mediated transpeptidation reaction. In some embodiments, the virus is a DNA virus. In some embodiments, the virus is a bacteriophage. In some embodiments, the virus is an M13 bacteriophage. In some embodiments, the target protein is a viral capsid protein. In some embodiments, the target protein is an M13 pIII, pVIII, or pIX capsid protein. In some embodiments, the agent is a protein, a peptide, a detectable label, a binding agent, a click-chemistry handle, or a small molecule. In some embodiments, the agent is a molecule that cannot be genetically encoded, e.g., a carbohydrate, a lipid, or a small molecule. In some embodiments, the agent is a fluorescent protein, streptavidin, biotin, a fluorophore, an antibody, or an antigen-binding antibody fragment. In some embodiments, the virus comprises a plurality of different target proteins conjugated to an agent via a sortase recognition motif. In some embodiments, at least one target protein is modified using SrtAaureus and at least one target protein is modified using SrtApyogenes. In some embodiments, a different agent is conjugated to each different target protein. In some embodiments, the virus is an M13 bacteriophage comprising a pIII capsid protein conjugated to streptavidin via a sortase recognition sequence, and a pVIII capsid protein conjugated to biotin via a sortase recognition sequence.

[0010] The present invention, in some aspects, provides viruses comprising a recombinant target protein, wherein the recombinant target protein comprises a sortase recognition motif. In some embodiments, the virus is a DNA virus. In some embodiments, the virus is a bacteriophage. In some embodiments, the virus is an M13 bacteriophage. In some embodiments, the target protein is a capsid protein. In some embodiments, the target protein is an M13 pIII, pVIII, or pIX capsid protein. In some embodiments, the sortase recognition motif is an N-terminal oligoglycine and/or the oligoalanine, comprising 1-10 N-terminal glycine residues or 1-10 N-terminal alanine residues, respectively. In some embodiments, the sortase recognition sequence comprises a C-terminal sortase recognition motif. In some embodiments, the C-terminal recognition motif is LPXTX, wherein each instance of X represents independently any amino acid residue. In some embodiments, the C-terminal recognition motif is LPETG (SEQ ID NO: 10) or LPETA (SEQ ID NO: 11). In some embodiments, the recombinant target protein comprises a loop structure harboring the sortase recognition motif and a protease cleavage site, e.g., a loop structure as disclosed in U.S. patent application Ser. No. 13/642,458, publication number US2013/0122043, by Guimaraes and Ploegh, the entire contents of which are incorporated herein by reference. In some embodiments, the loop structure comprises two cysteine residues that flank the sortase recognition motif and the protease cleavage site. In some embodiments, the loop structure is formed by a disulfide bond between the two cysteine residues. In some embodiments, the loop structure comprises

an amino acid sequence derived from a bacterial toxin comprising a loop structure, e.g., an amino acid sequence of at least 40, at least 50, at least 60, at least 70, at least 80, at least 90 amino acid residues that is homologous to, or that is at least 70%, at least 80%, at least 90%, at least 95% or at least 98% identical to the sequence of a bacterial toxin. In some embodiments, the bacterial toxin is a bacterial toxin that comprises a protease-sensitive loop. In some embodiments, the bacterial toxin is a bacterial exotoxin. In some embodiments, the toxin is an AB5 toxin. In some embodiments, the toxin is a cholera toxin, Shiga toxin (ST), the Shiga-like toxins (e.g., SLT1, SLT2, SLT2c, and SLT2e), E. coli heat labile enterotoxins LT-I (e.g., the two variants LT-Ih from human isolates and LT-Ip from porcine isolates), LT-IIa, and LT-IIb, or pertussis toxin (PT). The sequences of these and other suitable toxins are well known to those of skill in the art. See, e.g., U.S. patent application Ser. No. 13/642,458, publication number US2013/0122043, by Guimaraes and Ploegh, the entire contents of which are incorporated herein by reference. Some aspects of this invention provide engineered viral capsid proteins comprising such artificial loop structures harboring a sortase recognition motif and a protease cleavage site. It will be apparent to those of skill in the art that the methods, reagents, and strategies for engineering target proteins to comprise cleavable loop structures with sortase recognition motifs can be applied to viral capsid proteins, as described in more detail herein, but are not limited to such proteins. As will be apparent to those of skill in the art from the instant disclosure, the inventive methods, reagents, and strategies disclosed herein can be applied to install cleavable loop structures comprising a sortase recognition motif on any protein, including, but not limited to cytoskeletal proteins, extracellular matrix proteins, cell surface proteins, plasma proteins, coagulation factors, cell adhesion proteins, hormones and growth factors, receptors, DNA-binding proteins, transcription factors, antibodies and antibody fragments, chaperone proteins, histones, and enzymes. In some embodiments, the present disclosure provides such engineered proteins, e.g., an antibody or antibody fragment, an enzyme, a transcription factor, etc., comprising a cleavable loop structure with a sortase recognition motif. Methods of using such proteins, e.g., in the context of sortase-mediated functionalization of such proteins, described in more detail herein, are also provided.

[0011] Some aspects of this invention provide a kit comprising a recombinant nucleic acid encoding a viral capsid protein comprising a sortase recognition motif. In some embodiments, the recombinant nucleic acid is comprised in an expression vector. In some embodiments, the sortase recognition motif is an N-terminal oligoglycine and/or the oligoalanine, comprising 1-10 N-terminal glycine residues or 1-10 N-terminal alanine residues, respectively. In some embodiments, the sortase recognition motif is a C-terminal LPXTX sequence, wherein each instance of X represents independently any amino acid residue. In some embodiments, the C-terminal recognition motif is LPETG (SEQ ID NO: 10) or LPETA (SEQ ID NO: 11). In some embodiments, the kit further comprises a sortase. In some embodiments, the kit comprises SrtAaureus and/or SrtApyogenes. In some embodiments, the kit further comprises a substrate comprising a sortase recognition motif conjugated to an agent. In some embodiments, the sortase catalyzes a transpeptidation reaction involving the sortase recognition motif comprised in the viral capsid protein. In some embodiments, the kit further comprises a buffer or reagent useful for carrying out a sortase-mediated transpeptidation reaction.

[0012] The above summary is intended to provide an overview over some aspects of this invention and is not to be construed to limit the invention in any way. Additional aspects, advantages, and embodiments of this invention are described herein, and further embodiments will be apparent to those of skill in the art based on the instant disclosure. The entire contents of all references cited above and herein are hereby incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] FIG. 1. M13 bacteriophage structure and sortase schemes. M13 bacteriophage is composed of five capsid proteins. pVIII is the major capsid protein with ~2700 copies on each phage particle. The pVII and pIX are located at one end and start the assembly process, while pIII and pVI are at the other end and cap the phage. Note: the image is not to scale (a). The mechanism of chemo-enzymatic labeling for sortase A enzymes from Staphylococcus aureus (SrtAaureus, left) and Streptococcus pyogenes (SrtApyogenes, right) (SEQ ID NOS: 78, 91, 92 and 126) (b).

[0014] FIG. 2. pIII labeling. G5-pIII (SEQ ID NO: 77) modified phage was incubated with SrtAaureus and K(biotin)-LPETGG peptide (SEQ ID NO: 13) (a), or GFP-LPETG (SEQ ID NO: 10) (b), for 3 hrs at 37° C. or room temperature, respectively. The reactions were monitored by SDS-PAGE under reducing conditions followed by immunoblotting using streptavidin-HRP (a-top panel) or an anti-pIII antibody (a-bottom panel and b). There are five copies of pIII for each phage and the molecular weight markers are shown on the left. The unidentified anti-pIII reactive protein (*) is attributed to proteolyzed pIII. The identity of the GFP-pIII fusion product was determined by mass spectrometry. The amino acid sequences are as follows:

(SEQ ID NO: 14) MWSKGEELF GWWPWELD GOWNGHKFSW SGEGEGDAY GKLTLKFICT

LWTTLTYGWO CFSRYPDHMK QHDFFKSAMPEGYWQERTIF

AEWKFEGDT WINRIELKGO EFKEDGNIGH KLEYNNSHN

WYIMADKOKN GIKVNFKIRH NIEDGSVOLA DHYOONTPIG DGPWLLPDNH

YLSTQSALSK DPNEKRDHMW LLEFWTAAGI TLGMDELYK, PEGGGGGSA

EVESCAS HTENSFTNWW KDDKTLDRYA NYEGCLWNAT GWWWCTGDET

LAIPENEGGG SEGGGSEGGG SEGGGTKPPE YGDTPIPGYT

YINPLDGTYP PGTEQNPANP NPSLEESQPL, NTFMFQNNRF RNRQGALTWY WKTYYQYTPW SSKAMYDAYW NGKFRDCAFH SGFNEDLFWC

EYOGOSSDLP QPPVNAGGGS GGGSGGGSEG GGSEGGGSEG GGSEGGGSGG GSGSGDFDYE KMANANKGAM TENADENALQ SDAKGKLDSV ATDYGAAIDG FIGDWSGLAN GNGATGDFAG SNSQMAQVGD GDNSPLMNNF RQYLPSLPQS

WECRPFWFGA. GKPYEFSIDC DKINLFRGWF AFLLYWATFM YWFSTFANIL

RNKES.
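As an illustration only, and not part of the disclosure, the LPXTX recognition motif and the N-terminal oligoglycine or oligoalanine acceptor described in the summary can be located in a one-letter protein sequence with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the example sequences are short toy fragments modeled on a GFP-LPETG/pentaglycine junction, not the full sequences of any SEQ ID NO.

```python
import re

# LPXTX: Leu-Pro-X-Thr-X, where each X is any residue (one-letter codes).
# The lookahead wrapper lets overlapping occurrences be reported as well.
_LPXTX = re.compile(r"(?=(LP[A-Z]T[A-Z]))")


def _clean(sequence: str) -> str:
    """Drop spaces, punctuation and digits; return upper-case residues only."""
    return re.sub(r"[^A-Za-z]", "", sequence).upper()


def find_lpxtx(sequence: str):
    """Return (0-based start index, motif) for every LPXTX occurrence."""
    return [(m.start(), m.group(1)) for m in _LPXTX.finditer(_clean(sequence))]


def n_terminal_run(sequence: str, residue: str) -> int:
    """Count how many copies of `residue` open the sequence (e.g. G in GGGGG...)."""
    seq = _clean(sequence)
    return len(seq) - len(seq.lstrip(residue.upper()))


if __name__ == "__main__":
    # Hypothetical fragments, grouped in blocks the way sequence listings are.
    print(find_lpxtx("MDELYK LPETG GGGGS AET"))
    print(n_terminal_run("GGGGG SAET", "G"))
    print(n_terminal_run("AAGGGG DPAK", "A"))
```

The helper strips spacing first because sequence listings are usually printed in blocks of ten residues; a motif that straddles two blocks would otherwise be missed.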

[0015] The sequences of pIII and GFP are shown in underline and double underline, respectively. The peptides identified are in bold. The tryptic peptide comprising the GFP C-terminus, followed by the SrtAaureus cleavage site, fused to the N-terminal glycines of pIII is italicized.

[0016] FIG. 3. pIX labeling. G5HA-pIX (SEQ ID NO: 77) modified phage was incubated with SrtAaureus and K(biotin)-LPETGG peptide (SEQ ID NO: 13) (a), or GFP-LPETG (SEQ ID NO: 10) (b), at 37° C. and room temperature, respectively, for the times indicated. The reactions were monitored by SDS-PAGE under reducing conditions followed by immunoblotting using streptavidin-HRP (a-top panel) or an anti-HA antibody (a-bottom panel and b). There are five copies of pIX for each phage and the molecular weight markers are shown on the left. The identity of the GFP-pIX fusion product was determined by mass spectrometry. The amino acid sequences are as follows:

(SEQ ID NO: 15) MWSKGEEET GWWPWELD GDWNGHKFSW SGEGEGDATY GKLTLKFICT

TGKLPWPWPT LWTTLTYGWO CFSRYPDHMK QHDFFKSAMPEGYWQERTIF

FKDDGNYKTR AEWKFEGDT WINRELKGO EFKEDGNGE KEYNYNSHN

WYIMADKQKN GIKWNFKIRH NIEDGSWOLA DHYOONTPIG DGPWLLPDNH YLSTOSALSK DPNEKRDHMV LLEFWTAAGI TLGMDELYKL PETGGGGGYP YDWPDYAQGG QGWDMSWLWY SFASFWLGWC LRSGITYFTR LMETSS.

[0017] The sequences of GFP and pIX are underlined and double underlined, respectively. The peptides identified are in bold. The AspN digestion-resultant peptide comprising the GFP C-terminus, followed by the SrtAaureus cleavage site, fused to the N-terminal glycines of pIX is italicized.

[0018] FIG. 4. pVIII labeling. A2G4-pVIII modified phage was incubated with SrtApyogenes and K(biotin)-LPETAA (SEQ ID NO: 12) peptide (a), or GFP-LPETA (SEQ ID NO: 11) (b), at 37° C. for the times indicated in the figure. The reactions were monitored by SDS-PAGE under reducing conditions followed by immunoblotting using streptavidin-HRP (a) or an anti-GFP antibody (b). There are 2700 copies of pVIII for each phage and the molecular weight markers are shown on the left. The unidentified anti-GFP reactive protein (*) is attributed to proteolyzed GFP forming an intermediate with SrtApyogenes. The identity of the GFP-pVIII fusion product was determined by mass spectrometry. The amino acid sequences are as follows:

(SEQ ID NO: 16) MWSKGEELE GWWPWELD GOWNGHKESW SGEGEGDATY GKLTLKFICT

TGKLPVPWPT LWTTLTYGVO CFSRYPDHMK QHDFFKSATP EGYWOODPTI

FCKDDGNYK RAEWKFEGO WNRIELKG OFKEDGNG HKEYNYNSH

NWYIMADKQK NGTKVNFKTR HNTEDGSVOL ADHYOONTPI GDGPWLLPDN HYLSTQSALS KDPNEKRDHM WLLEFWTAAG ITLGMDELYK LPETAAGGGG DPAKAAFNSL QASATEYIGY AWAMWWWTVG ATTGTKLFKK FTSAS.
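The junctions visible in the fusion sequences (LPET followed by the glycines of the pIX construct, or LPET followed by the alanines of the pVIII construct) can be reproduced with a toy model of the transpeptidation step. The Python sketch below is illustrative only and rests on loudly stated assumptions: the `RULES` table, the `ligate` function, and the sortase keys are hypothetical names; the strict pairing of LPXTG with a glycine acceptor and LPXTA with an alanine acceptor is a simplification, and real enzymes may be less strict; the sequences are short toy fragments rather than any full SEQ ID NO. The sortase is modeled as cutting the substrate between the Thr and the following residue of the motif, after which the N-terminus of the acceptor is joined to the Thr.

```python
import re

# Simplified pairing rules (an assumption made for this sketch only):
# sortase label -> (residue that follows Thr in the substrate motif,
#                   N-terminal residue required on the target protein)
RULES = {
    "SrtA_aureus": ("G", "G"),
    "SrtA_pyogenes": ("A", "A"),
}


def ligate(substrate: str, target: str, sortase: str) -> str:
    """Return the fusion product of substrate (C-terminal LPXT|X) and target.

    The substrate is cut after the Thr of its last matching motif, the
    leaving group is discarded, and the target is appended.
    """
    leaving, nucleophile = RULES[sortase]
    sites = list(re.finditer(r"LP[A-Z]T(?=" + leaving + ")", substrate))
    if not sites:
        raise ValueError("substrate lacks a motif accepted by " + sortase)
    if not target.startswith(nucleophile):
        raise ValueError("target lacks the N-terminal acceptor for " + sortase)
    acyl_donor = substrate[: sites[-1].end()]  # everything up to and including Thr
    return acyl_donor + target


if __name__ == "__main__":
    # Toy GFP tail + pentaglycine/HA-tagged acceptor (hypothetical fragments).
    print(ligate("MDELYKLPETG", "GGGGGYPYDVPDYA", "SrtA_aureus"))
    # Toy GFP tail + AAGGGG acceptor.
    print(ligate("MDELYKLPETA", "AAGGGGDPAK", "SrtA_pyogenes"))
```

Keeping the pairing rules in a table rather than in the function body makes it easy to relax the orthogonality assumption, for example by allowing more than one acceptor residue per enzyme.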

[0019] The sequences of GFP and pVIII are shown in underline and double underline, respectively. The peptides identified are in bold. The tryptic peptide comprising the GFP C-terminus, followed by the SrtApyogenes cleavage site, fused to the N-terminal alanines of pVIII is italicized.

[0020] FIG. 5. Creation of a multi-phage structure. Schematic representation of the strategy used to build a lampbrush structure (a). Upon labeling of the N-terminus of pIII with streptavidin and of the N-terminus of pVIII with biotin using sortase-mediated reactions, the phage were mixed (SEQ ID NO: 10 and 11). The resulting product was visualized by dynamic light scattering (b) and by atomic force microscopy (c).

[0021] FIG. 6. Dual labeling of phage using orthogonal SrtAaureus and SrtApyogenes. Schematic representation of the strategy used to couple two different moieties to two different capsid proteins (SEQ ID NOs: 10 and 11) (a). Labeling of pVIII with a K(TAMRA)-LPETAA (SEQ ID NO: 12) peptide mediated by SrtApyogenes was followed by labeling of pIII with a single domain antibody directed to Class II MHC as a cell targeting moiety and SrtAaureus. The final product was analyzed by fluorescent scanning imaging to visualize labeling of pVIII, followed by immunoblotting using an anti-pIII antibody to monitor the efficiency of labeling (b). There are five copies of pIII for each phage. The unidentified anti-pIII reactive proteins (*) are attributed to proteolyzed pIII. Binding of the dual labeled phage to lymphocytic Class II MHC+ cells was observed by flow cytometry (c). The Class II MHC+ enriched cell fraction of the lymph nodes of a C57BL/6 mouse was stained for B220 together with the dual labeled phage (phage-TAMRA-VHH7), TAMRA labeled phage (no cell targeting motif, phage-TAMRA), or anti-Class II MHC directly conjugated to TAMRA (TAMRA-VHH7).

[0022] FIG. 7. Characterization of the GFP-pIII conjugate by mass spectrometry. The polypeptide corresponding to GFP-pIII was excised from the SDS-PAGE gel and digested with trypsin. The resulting peptides were analyzed by liquid chromatography MS/MS. Peptides positively identified by sequence are highlighted and bold. Sequences correspond, from top to bottom, to SEQ ID NOS 162-209, respectively.

[...] sequence are highlighted and bold. Sequences correspond, from top to bottom, to SEQ ID NOs 259-279, respectively.

[0025] FIG. 10. pIII labeling with streptavidin. G5-pIII phage (SEQ ID NO: 77) was incubated with SrtAaureus and streptavidin containing a C-terminal LPETG (SEQ ID NO: 10) motif in each monomer. The reactions were monitored by SDS-PAGE under reducing conditions followed by immunoblotting using an anti-pIII antibody. There are five copies of pIII for each phage and the molecular weight markers are shown on the left. The unidentified anti-pIII reactive protein (*) is attributed to proteolyzed pIII. The identity of the streptavidin-pIII fusion product was determined by mass spectrometry. The amino acid sequences are as follows:

(SEQ ID NO: 17) MAEAGITGTW YNQLGSTFIW TAGADGALTG TYESAWGNAE SRYWLTGRYD SAPATIOGSGT ALGWTWAWKN NYRNAHSATT WSGQYWGGAE ARINTQWLLT SGTTEANAWK STLWGHDTFT KWKPSAASLE LPETGGGGGSAETWESCLAK

SHTENSFTNW WKDDKTLDRY ANYEGCLWNA TGVWWCTGDE TOCYGTWWPI

GLAIPENEGG GSEGGGSEGG GSEGGGTKPP EYGDTPIPGY TYINPLDGTY

PPGTEQNPAN PNPSLEESQP LNTFMFQNNR FRNROGALTW YTGTWTQGTD WSSKAMYDAY WINGKFRDCAF HSGFNEDLFW CEYQGQSSDL PQPPVNAGGG SGGGSGGGSE GCGSEGGGSE GCGSEGGGSG GGSGSGDFDY EKMANANKGA MTENADENAL QSDAKGKLDS WATDYGAAID GFIGDWSGLA NGNGATGDFA GSNSQMAQVG DGDNSPLMNN FRQYLPSLPQ SWECRPFWFG

AGKPYEFSID CDKINLFRGW FAFLLYWATF MYWFSTFANI LRNKES.

[0026] The sequences of streptavidin monomer and pIII are shown in underline and double underline, respectively. The peptides identified are in bold. The tryptic peptide comprising the streptavidin C-terminus, followed by the SrtAaureus cleavage site, fused to the N-terminal glycines of pIII is italicized.

[0027] FIG. 11. AFM characterization of lampbrush phage structure. Phage with the N-terminus of pIII labeled with streptavidin and phage with the N-terminus of pVIII conjugated to biotin were created using sortase-mediated reactions. The phage preparations were visualized by atomic force microscopy (AFM) before (top right and top left panels) and after mixing (bottom panels).

[0028] FIG. 12. Labeling of loop-pIII. Schematic for C-terminal labeling using the loop structure (SEQ ID NOs: 10 and 13) (a). LoopXa-pIII phage was incubated with SrtAaureus, Factor Xa, and GGGK(TAMRA) (SEQ ID NO: 127) (b). The
reactions were monitored by SDS-PAGE under reducing and 0023 FIG.8. Characterization of the GFP-pIX conjugate non-reducing conditions followed by fluorescent imaging by mass spectrometry. The polypeptide corresponding to and immunoblotting with an anti-pII antibody. The molecu GFP-pIII was excised from the SDS-PAGE gel and digested lar weight markers are shown on the left. with AspN. The resulting peptides were analyzed by liquid 0029 FIG. 13. Orthogonal labeling of phage with three chromatography MS/MS. Peptides positively identified by fluorophores. Schematic representation of the strategy used sequence are highlighted and bold. Sequences correspond, for triple labeling of a single phage particle (SEQID NOs: 10 from top to bottom, to SEQID NOs 210-258, respectively. and 11)(a). TriSrt phage (lane 1) was incubated with SrtA 0024 FIG. 9. Characterization of the GFP-pVIII conju genes and K(TAMRA)-LPETAA (SEQID NO: 12) and puri gate by mass spectrometry. The polypeptide corresponding to fied by PEG8000/NaCl precipitation (lane 2). The TAMRA GFP-pVIII was excised from the SDS-PAGEgeland digested pVIII labeled triSrt phage was incubated with Factor Xa, with trypsin. The resulting peptides were analyzed by liquid SrtA, FAM-LPETGG (SEQID NO: 13), and/or G-Al chromatography MS/MS. Peptides positively identified by exag47, and purified. These reactions were monitored by US 2014/003 0697 A1 Jan. 30, 2014

SDS-PAGE under non-reducing conditions, followed by fluorescent imaging and immunoblotting with an anti-pIII or anti-HA antibody (b). The molecular weight markers are indicated on the left.

0030 FIG. 14. Building phage by DNA hybridization. Scheme of the multi-phage final structure upon DNA hybridization (a). TriSrt phage was incubated with DNA-peptides and SrtA, and purified by PEG8000/NaCl precipitation. The reactions were monitored by SDS-PAGE under non-reducing conditions, followed by fluorescent imaging (b). The samples with DNA-peptide alone had a concentration of 650 nM instead of 50 µM. The molecular weight markers are shown on the left. Phage were linked and imaged by atomic force microscopy (c). The lengths of the phage structures were measured and collected in a histogram and analyzed by dynamic light scattering (d). Fluorescently labeled phage were connected and imaged by fluorescent microscopy (e).

0031 FIG. 15. C-terminal display on pIII, pVI, and pIX. DNA sequences encoding LPETGG-(HA) (SEQ ID NO: 13), GGGS-LPETGG-(HA) (SEQ ID NO: 286), and (GGGS)-LPETGG-(HA) (SEQ ID NO: 90) were inserted genetically at the C-terminus of pIII, pIX, and pVI. To determine whether the inserts had been incorporated into the genome, the ligation reactions were analyzed by PCR using one of the insertion oligonucleotides from the ligation and a second primer annealing in an unmodified part of the phage vector.

0032 FIG. 16. Labeling of pIII with G-CtxB. LoopXa-pIII phage was incubated with SrtA, Factor Xa, and G-CtxB. The reactions were monitored by SDS-PAGE under non-reducing conditions followed by immunoblotting with an anti-pIII antibody and anti-CtxB antibody. The molecular weight markers are shown on the left. The identity of the CtxB-pIII fusion product was determined by mass spectrometry (see sequence in the Figure). The peptides identified are highlighted in bold in the Figure.

(SEQ ID NO: 18) EPWIHHAPPG CGNALPETGG GTPONITDLC AEYHNTQIHT LNDKIFSYTE

SLAGKREMAI ITFKNGATFQ WEWPGSOHID SQKKAIERMK DTLRIAYLTE

AKWEKCWWN INKPHAAA SMAN

SSMSNTCDEK TQSLGVKGGG SAETVESCLA KSHTENSFTN WWKDDKTLDR

YANYEGCLWN ATGWWWCTGD ETQCYGTWWP IGLAIPENEG GGSEGGGSEG

GGSEGGGTKP PEYGDTPIPG YTYINPLDGT YPPGTEQNPA NPNPSLEESQ

PLNTFMFONN RFRNRQGALT WYTGTWTQGT DPWKTYYQYT PWSSKAMYDA

YWNGKFRDCA FHSGFNEDLF WCEYQGQSSD LPQPPWNAGG GSGGGSGGGS

EGGGSEGGGS EGGGSEGGGS GGGSGSGDFD YEKMANANKG AMTENADENA

LQSDAKGKLD SWATDYGAAI DGFIGDWSGL ANGNGATGDF AGSNSQMAQV

GDGDNSPLMN NFRQYLPSLP QSWECRPFWF GAGKPYEFSI DCDKINLFRG

WFAFLLYWAT FMYWFSTFAN ILRNKES.

0033 The amino acid sequence of pIII is underlined and the sequence of CtxB is shown in bold in the sequence above. The chymotryptic peptide comprising the C-terminus of the loop, followed by the SrtA cleavage site, fused to the N-terminal glycines of CtxB is double underlined. The cysteine residues forming the S–S bond are framed.

0034 FIG. 17. Building end-to-end phage dimers. Schematic representation of the strategy used to build end-to-end phage dimers (a). G5-pIII phage (SEQ ID NO: 77), loopXa-pIII phage, Factor Xa, and SrtA were incubated at room temperature for 60 hrs and purified by PEG8000/NaCl precipitation. The resulting product was visualized by atomic force microscopy (b).

0035 FIG. 18. Conjugation of DNA to peptides. Thiolated DNA was conjugated to either (maleimide)-LPETGG (SEQ ID NO: 13) or GGGK(maleimide) peptide (SEQ ID NO: 127). The conjugated peptides were analyzed by MALDI-TOF mass spectrometry (a) and by TBE-Urea PAGE followed by fluorescent imaging (b).

0036 FIG. 19. Characterization of DNA hybridized phage multimers. TriSrt phage labeled with different DNA oligonucleotides were linked by DNA C and F. The resultant phage particles were imaged by atomic force microscopy (top panel). Only individual phage particles were observed in the absence of DNA C and F (bottom panel).

0037 FIG. 20. Characterization of phage trimers after digest with restriction enzymes. Multi-phage structures were digested with restriction enzymes AatII (top panel), AgeI (middle panel), or both (bottom panel) and analyzed by atomic force microscopy.

0038 FIG. 21. Characterization of phage multimers by fluorescent microscopy. Individual triSrt phage particles fluorescently labeled on their pVIII were labeled with DNA on their ends by sortase and linked together. The multi-phage structures were imaged by fluorescent microscopy only when the crosslinking oligonucleotides were present.
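The mass-spectrometry characterizations in the figure legends above depend on identifying the proteolytic peptide that spans the sortase junction. As a minimal sketch, and not the analysis actually used in this application, the Python below applies the common textbook trypsin rule (cleave C-terminal to K or R unless the next residue is P) and picks out the peptide that contains the LPETG motif. The function names `tryptic_digest` and `junction_peptides` are hypothetical, and the test fragment is copied as printed from the SEQ ID NO: 17 listing, so any OCR errors in that listing carry over.

```python
def tryptic_digest(sequence):
    """Split a one-letter protein sequence after K or R, except before P.

    This is the simplified textbook trypsin rule; missed cleavages and
    other real-world effects are deliberately ignored (assumption).
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def junction_peptides(sequence, motif="LPETG"):
    """Return the digest peptides that contain the sortase motif."""
    return [p for p in tryptic_digest(sequence) if motif in p]


if __name__ == "__main__":
    # Fragment around the streptavidin/pIII junction, as printed in the listing.
    fragment = "TFTKWKPSAASLELPETGGGGGSAETWESCLAK"
    print(tryptic_digest(fragment))
    # ['TFTK', 'WKPSAASLELPETGGGGGSAETWESCLAK']
    print(junction_peptides(fragment))
    # ['WKPSAASLELPETGGGGGSAETWESCLAK']
```

The K at position 6 of the fragment is followed by P and is therefore not cleaved, which is why the junction peptide starts at W in this simplified model.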

DEFINITIONS

0039 Definitions of specific functional groups and chemical terms are described in more detail below. For purposes of this invention, the chemical elements are identified in accordance with the Periodic Table of the Elements, CAS version, Handbook of Chemistry and Physics, 75th Ed., inside cover, and specific functional groups are generally defined as described therein. Additionally, general principles of organic chemistry, as well as specific functional moieties and reactivity, are described in Organic Chemistry, Thomas Sorrell, University Science Books, Sausalito, 1999; Smith and March, March's Advanced Organic Chemistry, 5th Edition, John Wiley & Sons, Inc., New York, 2001; Larock, Comprehensive Organic Transformations, VCH Publishers, Inc., New York, 1989; Carruthers, Some Modern Methods of Organic Synthesis, 3rd Edition, Cambridge University Press, Cambridge, 1987.

0040 The term “aliphatic,” as used herein, includes both saturated and unsaturated, nonaromatic, straight chain (i.e., unbranched), branched, acyclic, and cyclic (i.e., carbocyclic) hydrocarbons, which are optionally substituted with one or more functional groups. As will be appreciated by one of ordinary skill in the art, “aliphatic” is intended herein to include, but is not limited to, alkyl, alkenyl, alkynyl, cycloalkyl, cycloalkenyl, and cycloalkynyl moieties. Thus, as used herein, the term “alkyl” includes straight, branched and cyclic alkyl groups. An analogous convention applies to other generic terms such as “alkenyl,” “alkynyl,” and the like. Furthermore, as used herein, the terms “alkyl,” “alkenyl,” “alkynyl,” and the like encompass both substituted and unsubstituted groups. In certain embodiments, as used herein, “aliphatic” is used to indicate those aliphatic groups (cyclic, acyclic, substituted, unsubstituted, branched or unbranched) having 1-20 carbon atoms (C1-20 aliphatic). In certain embodiments, the aliphatic group has 1-10 carbon atoms (C1-10 aliphatic). In certain embodiments, the aliphatic group has 1-6 carbon atoms (C1-6 aliphatic). In certain embodiments, the aliphatic group has 1-5 carbon atoms (C1-5 aliphatic). In certain embodiments, the aliphatic group has 1-4 carbon atoms (C1-4 aliphatic). In certain embodiments, the aliphatic group has 1-3 carbon atoms (C1-3 aliphatic). In certain embodiments, the aliphatic group has 1-2 carbon atoms (C1-2 aliphatic). Aliphatic group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0041 The term “alkyl,” as used herein, refers to saturated, straight- or branched-chain hydrocarbon radicals derived from a hydrocarbon moiety containing between one and twenty carbon atoms by removal of a single hydrogen atom. In some embodiments, the alkyl group employed in the invention contains 1-20 carbon atoms (C1-20 alkyl). In another embodiment, the alkyl group employed contains 1-15 carbon atoms (C1-15 alkyl). In another embodiment, the alkyl group employed contains 1-10 carbon atoms (C1-10 alkyl). In another embodiment, the alkyl group employed contains 1-8 carbon atoms (C1-8 alkyl). In another embodiment, the alkyl group employed contains 1-6 carbon atoms (C1-6 alkyl). In another embodiment, the alkyl group employed contains 1-5 carbon atoms (C1-5 alkyl). In another embodiment, the alkyl group employed contains 1-4 carbon atoms (C1-4 alkyl). In another embodiment, the alkyl group employed contains 1-3 carbon atoms (C1-3 alkyl). In another embodiment, the alkyl group employed contains 1-2 carbon atoms (C1-2 alkyl). Examples of alkyl radicals include, but are not limited to, methyl, ethyl, n-propyl, isopropyl, n-butyl, iso-butyl, sec-butyl, sec-pentyl, iso-pentyl, tert-butyl, n-pentyl, neopentyl, n-hexyl, sec-hexyl, n-heptyl, n-octyl, n-decyl, n-undecyl, dodecyl, and the like, which may bear one or more substituents. Alkyl group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety. The term “alkylene,” as used herein, refers to a biradical derived from an alkyl group, as defined herein, by removal of two hydrogen atoms. Alkylene groups may be cyclic or acyclic, branched or unbranched, substituted or unsubstituted. Alkylene group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0042 The term “alkenyl,” as used herein, denotes a monovalent group derived from a straight- or branched-chain hydrocarbon moiety having at least one carbon-carbon double bond by the removal of a single hydrogen atom. In certain embodiments, the alkenyl group employed in the invention contains 2-20 carbon atoms (C2-20 alkenyl). In some embodiments, the alkenyl group employed in the invention contains 2-15 carbon atoms (C2-15 alkenyl). In another embodiment, the alkenyl group employed contains 2-10 carbon atoms (C2-10 alkenyl). In still other embodiments, the alkenyl group contains 2-8 carbon atoms (C2-8 alkenyl). In yet other embodiments, the alkenyl group contains 2-6 carbons (C2-6 alkenyl). In yet other embodiments, the alkenyl group contains 2-5 carbons (C2-5 alkenyl). In yet other embodiments, the alkenyl group contains 2-4 carbons (C2-4 alkenyl). In yet other embodiments, the alkenyl group contains 2-3 carbons (C2-3 alkenyl). In yet other embodiments, the alkenyl group contains 2 carbons (C2 alkenyl). Alkenyl groups include, for example, ethenyl, propenyl, butenyl, 1-methyl-2-buten-1-yl, and the like, which may bear one or more substituents. Alkenyl group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety. The term “alkenylene,” as used herein, refers to a biradical derived from an alkenyl group, as defined herein, by removal of two hydrogen atoms. Alkenylene groups may be cyclic or acyclic, branched or unbranched, substituted or unsubstituted. Alkenylene group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0043 The term “alkynyl,” as used herein, refers to a monovalent group derived from a straight- or branched-chain hydrocarbon having at least one carbon-carbon triple bond by the removal of a single hydrogen atom. In certain embodiments, the alkynyl group employed in the invention contains 2-20 carbon atoms (C2-20 alkynyl). In some embodiments, the alkynyl group employed in the invention contains 2-15 carbon atoms (C2-15 alkynyl). In another embodiment, the alkynyl group employed contains 2-10 carbon atoms (C2-10 alkynyl). In still other embodiments, the alkynyl group contains 2-8 carbon atoms (C2-8 alkynyl). In still other embodiments, the alkynyl group contains 2-6 carbon atoms (C2-6 alkynyl). In still other embodiments, the alkynyl group contains 2-5 carbon atoms (C2-5 alkynyl). In still other embodiments, the alkynyl group contains 2-4 carbon atoms (C2-4 alkynyl). In still other embodiments, the alkynyl group contains 2-3 carbon atoms (C2-3 alkynyl). In still other embodiments, the alkynyl group contains 2 carbon atoms (C2 alkynyl). Representative alkynyl groups include, but are not limited to, ethynyl, 2-propynyl (propargyl), 1-propynyl, and the like, which may bear

one or more substituents. Alkynyl group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety. The term “alkynylene,” as used herein, refers to a biradical derived from an alkynyl group, as defined herein, by removal of two hydrogen atoms. Alkynylene groups may be cyclic or acyclic, branched or unbranched, substituted or unsubstituted. Alkynylene group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0044 The term “aptamer,” as used herein, refers to a nucleic acid ligand or receptor that binds to a target molecule. In some embodiments, an aptamer binds a target molecule with high affinity, e.g., with a K of less than 10 M, less than 107M, less than 10 M, less than 10 M, or less than 10' M. In some embodiments, an aptamer binds a target molecule with high specificity, e.g., in that it does not bind a ligand other than the target ligand with an affinity of less than 10 M. Typically, an aptamer forms a secondary structure resulting in a three-dimensional complementarity to the target molecule or a substructure thereof.

0045 The term “carbocyclic” or “carbocyclyl,” as used herein, refers to a cyclic aliphatic group containing 3-10 carbon ring atoms (C3-10 carbocyclic). Carbocyclic group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0046 The term “heteroaliphatic,” as used herein, refers to an aliphatic moiety, as defined herein, which includes both saturated and unsaturated, nonaromatic, straight chain (i.e., unbranched), branched, acyclic, cyclic (i.e., heterocyclic), or polycyclic hydrocarbons, which are optionally substituted with one or more functional groups, and that further contains one or more heteroatoms (e.g., oxygen, sulfur, nitrogen, phosphorus, or silicon atoms) between carbon atoms. In certain embodiments, heteroaliphatic moieties are substituted by independent replacement of one or more of the hydrogen atoms thereon with one or more substituents. As will be appreciated by one of ordinary skill in the art, “heteroaliphatic” is intended herein to include, but is not limited to, heteroalkyl, heteroalkenyl, heteroalkynyl, heterocycloalkyl, heterocycloalkenyl, and heterocycloalkynyl moieties. Thus, the term “heteroaliphatic” includes the terms “heteroalkyl,” “heteroalkenyl,” “heteroalkynyl,” and the like. Furthermore, as used herein, the terms “heteroalkyl,” “heteroalkenyl,” “heteroalkynyl,” and the like encompass both substituted and unsubstituted groups. In certain embodiments, as used herein, “heteroaliphatic” is used to indicate those heteroaliphatic groups (cyclic, acyclic, substituted, unsubstituted, branched or unbranched) having 1-20 carbon atoms and 1-6 heteroatoms (C1-20 heteroaliphatic). In certain embodiments, the heteroaliphatic group contains 1-10 carbon atoms and 1-4 heteroatoms (C1-10 heteroaliphatic). In certain embodiments, the heteroaliphatic group contains 1-6 carbon atoms and 1-3 heteroatoms (C1-6 heteroaliphatic). In certain embodiments, the heteroaliphatic group contains 1-5 carbon atoms and 1-3 heteroatoms (C1-5 heteroaliphatic). In certain embodiments, the heteroaliphatic group contains 1-4 carbon atoms and 1-2 heteroatoms (C1-4 heteroaliphatic). In certain embodiments, the heteroaliphatic group contains 1-3 carbon atoms and 1 heteroatom (C1-3 heteroaliphatic). In certain embodiments, the heteroaliphatic group contains 1-2 carbon atoms and 1 heteroatom (C1-2 heteroaliphatic). Heteroaliphatic group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0047 The term “heteroalkyl,” as used herein, refers to an alkyl moiety, as defined herein, which contains one or more heteroatoms (e.g., oxygen, sulfur, nitrogen, phosphorus, or silicon atoms) in between carbon atoms. In certain embodiments, the heteroalkyl group contains 1-20 carbon atoms and 1-6 heteroatoms (C1-20 heteroalkyl). In certain embodiments, the heteroalkyl group contains 1-10 carbon atoms and 1-4 heteroatoms (C1-10 heteroalkyl). In certain embodiments, the heteroalkyl group contains 1-6 carbon atoms and 1-3 heteroatoms (C1-6 heteroalkyl). In certain embodiments, the heteroalkyl group contains 1-5 carbon atoms and 1-3 heteroatoms (C1-5 heteroalkyl). In certain embodiments, the heteroalkyl group contains 1-4 carbon atoms and 1-2 heteroatoms (C1-4 heteroalkyl). In certain embodiments, the heteroalkyl group contains 1-3 carbon atoms and 1 heteroatom (C1-3 heteroalkyl). In certain embodiments, the heteroalkyl group contains 1-2 carbon atoms and 1 heteroatom (C1-2 heteroalkyl). The term “heteroalkylene,” as used herein, refers to a biradical derived from a heteroalkyl group, as defined herein, by removal of two hydrogen atoms. Heteroalkylene groups may be cyclic or acyclic, branched or unbranched, substituted or unsubstituted. Heteroalkylene group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0048 The term “heteroalkenyl,” as used herein, refers to an alkenyl moiety, as defined herein, which further contains one or more heteroatoms (e.g., oxygen, sulfur, nitrogen, phosphorus, or silicon atoms) in between carbon atoms. In certain embodiments, the heteroalkenyl group contains 2-20 carbon atoms and 1-6 heteroatoms (C2-20 heteroalkenyl). In certain embodiments, the heteroalkenyl group contains 2-10 carbon atoms and 1-4 heteroatoms (C2-10 heteroalkenyl). In certain embodiments, the heteroalkenyl group contains 2-6 carbon atoms and 1-3 heteroatoms (C2-6 heteroalkenyl). In certain embodiments, the heteroalkenyl group contains 2-5 carbon atoms and 1-3 heteroatoms (C2-5 heteroalkenyl). In certain embodiments, the heteroalkenyl group contains 2-4 carbon atoms and 1-2 heteroatoms (C2-4 heteroalkenyl). In certain embodiments, the heteroalkenyl group contains 2-3 carbon atoms and 1 heteroatom (C2-3 heteroalkenyl). The term “heteroalkenylene,” as used herein, refers to a biradical derived from a heteroalkenyl group, as defined herein, by removal of two hydrogen atoms. Heteroalkenylene groups may be cyclic or acyclic, branched or unbranched, substituted or unsubstituted.

0049 The term “heteroalkynyl,” as used herein, refers to an alkynyl moiety, as defined herein, which further contains one or more heteroatoms (e.g., oxygen, sulfur, nitrogen, phosphorus, or silicon atoms) in between carbon atoms. In certain embodiments, the heteroalkynyl group contains 2-20 carbon atoms and 1-6 heteroatoms (C2-20 heteroalkynyl). In certain embodiments, the heteroalkynyl group contains 2-10 carbon atoms and 1-4 heteroatoms (C2-10 heteroalkynyl). In certain embodiments, the heteroalkynyl group contains 2-6 carbon atoms and 1-3 heteroatoms (C2-6 heteroalkynyl). In certain embodiments, the heteroalkynyl group contains 2-5 carbon atoms and 1-3 heteroatoms (C2-5 heteroalkynyl). In certain embodiments, the heteroalkynyl group contains 2-4 carbon atoms and 1-2 heteroatoms (C2-4 heteroalkynyl). In certain embodiments, the heteroalkynyl group contains 2-3 carbon

atoms and 1 heteroatom (C2-3 heteroalkynyl). The term “heteroalkynylene,” as used herein, refers to a biradical derived from a heteroalkynyl group, as defined herein, by removal of two hydrogen atoms. Heteroalkynylene groups may be cyclic or acyclic, branched or unbranched, substituted or unsubstituted.

0050 The term “heterocyclic,” “heterocycles,” or “heterocyclyl,” as used herein, refers to a cyclic heteroaliphatic group. A heterocyclic group refers to a non-aromatic, partially unsaturated or fully saturated, 3- to 10-membered ring system, which includes single rings of 3 to 8 atoms in size, and bi- and tri-cyclic ring systems which may include aromatic five- or six-membered aryl or heteroaryl groups fused to a non-aromatic ring. These heterocyclic rings include those having from one to three heteroatoms independently selected from oxygen, sulfur, and nitrogen, in which the nitrogen and sulfur heteroatoms may optionally be oxidized and the nitrogen heteroatom may optionally be quaternized. In certain embodiments, the term heterocyclic refers to a non-aromatic 5-, 6-, or 7-membered ring or polycyclic group wherein at least one ring atom is a heteroatom selected from O, S, and N (wherein the nitrogen and sulfur heteroatoms may be optionally oxidized), and the remaining ring atoms are carbon, the radical being joined to the rest of the molecule via any of the ring atoms. Heterocyclyl groups include, but are not limited to, a bi- or tri-cyclic group, comprising fused five, six, or seven-membered rings having between one and three heteroatoms independently selected from oxygen, sulfur, and nitrogen, wherein (i) each 5-membered ring has 0 to 2 double bonds, each 6-membered ring has 0 to 2 double bonds, and each 7-membered ring has 0 to 3 double bonds, (ii) the nitrogen and sulfur heteroatoms may be optionally oxidized, (iii) the nitrogen heteroatom may optionally be quaternized, and (iv) any of the above heterocyclic rings may be fused to an aryl or heteroaryl ring. Exemplary heterocycles include azacyclopropanyl, azacyclobutanyl, 1,3-diazatidinyl, piperidinyl, piperazinyl, azocanyl, thiaranyl, thietanyl, tetrahydrothiophenyl, dithiolanyl, thiacyclohexanyl, oxiranyl, oxetanyl, tetrahydrofuranyl, tetrahydropuranyl, dioxanyl, oxathiolanyl, morpholinyl, thioxanyl, tetrahydronaphthyl, and the like, which may bear one or more substituents. Substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0051 The term “aryl,” as used herein, refers to an aromatic mono- or polycyclic ring system having 3-20 ring atoms, of which all the ring atoms are carbon, and which may be substituted or unsubstituted. In certain embodiments of the present invention, “aryl” refers to a mono, bi, or tricyclic C-C aromatic ring system having one, two, or three aromatic rings which include, but are not limited to, phenyl, biphenyl, naphthyl, and the like, which may bear one or more substituents. Aryl substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety. The term “arylene,” as used herein, refers to an aryl biradical derived from an aryl group, as defined herein, by removal of two hydrogen atoms. Arylene groups may be substituted or unsubstituted. Arylene group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety. Additionally, arylene groups may be incorporated as a linker group into an alkylene, alkenylene, alkynylene, heteroalkylene, heteroalkenylene, or heteroalkynylene group, as defined herein.

0052 The term “heteroaryl,” as used herein, refers to an aromatic mono- or polycyclic ring system having 3-20 ring atoms, of which one ring atom is selected from S, O, and N; zero, one, or two ring atoms are additional heteroatoms independently selected from S, O, and N; and the remaining ring atoms are carbon, the radical being joined to the rest of the molecule via any of the ring atoms. Exemplary heteroaryls include, but are not limited to, pyrrolyl, pyrazolyl, imidazolyl, pyridinyl, pyrimidinyl, pyrazinyl, pyridazinyl, triazinyl, tetrazinyl, pyyrolizinyl, indolyl, quinolinyl, isoquinolinyl, benzoimidazolyl, indazolyl, quinolinyl, isoquinolinyl, quinolizinyl, cinnolinyl, quinazolynyl, phthalazinyl, naphthridinyl, quinoxalinyl, thiophenyl, thianaphthenyl, furanyl, benzofuranyl, benzothiazolyl, thiazolynyl, isothiazolyl, thiadiazolynyl, oxazolyl, isoxazolyl, oxadiaziolyl, oxadiaziolyl, and the like, which may bear one or more substituents. Heteroaryl substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety. The term “heteroarylene,” as used herein, refers to a biradical derived from a heteroaryl group, as defined herein, by removal of two hydrogen atoms. Heteroarylene groups may be substituted or unsubstituted. Additionally, heteroarylene groups may be incorporated as a linker group into an alkylene, alkenylene, alkynylene, heteroalkylene, heteroalkenylene, or heteroalkynylene group, as defined herein. Heteroarylene group substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0053 The term “acyl,” as used herein, is a subset of a substituted alkyl group, and refers to a group having the general formula —C(=O)R', —C(=O)OR', —C(=O)—O—C(=O)R', —C(=O)SR', —C(=O)N(R')2, —C(=S)R', —C(=S)N(R')2, and —C(=S)S(R'), —C(=NR')R', —C(=NR')OR', —C(=NR')SR', and —C(=NR')N(R')2, wherein R' is hydrogen; halogen; substituted or unsubstituted hydroxyl; substituted or unsubstituted thiol; substituted or unsubstituted amino; acyl; optionally substituted aliphatic; optionally substituted heteroaliphatic; optionally substituted alkyl; optionally substituted alkenyl; optionally substituted alkynyl; optionally substituted aryl, optionally substituted heteroaryl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, mono- or di-aliphaticamino, mono- or di-heteroaliphaticamino, mono- or di-alkylamino, mono- or di-heteroalkylamino, mono- or di-arylamino, or mono- or di-heteroarylamino; or two R' groups taken together form a 5- to 6-membered heterocyclic ring. Exemplary acyl groups include aldehydes (—CHO), carboxylic acids (—CO2H), ketones, acyl halides, esters, amides, imines, carbonates, carbamates, and ureas. Acyl substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0054 The term “acylene,” as used herein, is a subset of a substituted alkylene, substituted alkenylene, substituted alkynylene, substituted heteroalkylene, substituted heteroalkenylene, or substituted heteroalkynylene group, and refers to an acyl group having the general formulae: —R0—(C=X1)—R0—, —R0—X2(C=X1)—R0—, or —R0—X2(C=X1)X3—R0—, where X1, X2, and X3 is, independently, oxygen, sulfur, or NR, wherein R is hydrogen or optionally substituted aliphatic, and R0 is an optionally substituted alkylene, alkenylene, alkynylene, heteroalkylene, heteroalkenylene, or heteroalkynylene group, as defined herein. Exemplary acylene groups wherein R0 is alkylene include —(CH2)T—O(C=O)—(CH2)T—; —(CH2)T—NR(C=O)—(CH2)T—; —(CH2)T—O(C=NR)—(CH2)T—; ... ; or —(CH2)T—S(C=O)—(CH2)T—, and the like, which may bear one or more substituents; and wherein each instance of T is, independently, an integer between 0 to 20. Acylene substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety.

0055 The term “amino,” as used herein, refers to a group of the formula (—NH2). A “substituted amino” refers either to a mono-substituted amine (—NHR') or a disubstituted amine (—NR'2), wherein the R' substituent is any substituent as described herein that results in the formation of a stable moiety (e.g., an amino protecting group; aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, amino, nitro, hydroxyl, thiol, halo, aliphaticamino, heteroaliphaticamino, alkylamino, heteroalkylamino, arylamino, heteroarylamino, alkylaryl, arylalkyl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, acyloxy, and the like, each of which may or may not be further substituted). In certain embodiments, the R' substituents of the di-substituted amino group (—NR'2) form a 5- to 6-membered heterocyclic ring.

0056 The term “hydroxy” or “hydroxyl,” as used herein, refers to a group of the formula (—OH). A “substituted hydroxyl” refers to a group of the formula (—OR'), wherein R' can be any substituent which results in a stable moiety (e.g., a hydroxyl protecting group; aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, nitro, alkylaryl, arylalkyl, and the like, each of which may or may not be further substituted).

0057 The term “thio” or “thiol,” as used herein, refers to a group of the formula (—SH). A “substituted thiol” refers to a group of the formula (—SR), wherein R can be any substituent that results in the formation of a stable moiety (e.g., a thiol protecting group; aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, sulfinyl, sulfonyl, cyano, nitro, alkylaryl, arylalkyl, and the like, each of which may or may not be further substituted).

0058 The term “imino,” as used herein, refers to a group of the formula (=NR), wherein R corresponds to hydrogen or any substituent as described herein, that results in the formation of a stable moiety (for example, an amino protecting group; aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, amino, hydroxyl, alkylaryl, arylalkyl, and the like, each of which may or may not be further substituted).

0059 The term “azide” or “azido,” as used herein, refers to a group of the formula (—N3).

0060 The terms “halo” and “halogen,” as used herein, refer to an atom selected from fluorine (fluoro, —F), chlorine (chloro, —Cl), bromine (bromo, —Br), and iodine (iodo, —I).

0061 The term “agent,” as used herein, refers to any molecule, entity, or moiety that can be conjugated to a sortase recognition motif. For example, an agent may be a protein, an amino acid, a peptide, a polynucleotide, a carbohydrate, a detectable label, a binding agent, a tag, a metal atom, a contrast agent, a catalyst, a non-polypeptide polymer, a synthetic polymer, a recognition element, a lipid, a linker, or chemical compound, such as a small molecule. In some embodiments, the agent is a binding agent, for example, a ligand or a ligand-binding molecule, streptavidin, biotin, an antibody or an antibody fragment. In some embodiments, the agent cannot be genetically encoded. In some such embodiments, the agent is a lipid, a carbohydrate, or a small molecule. Additional agents suitable for use in embodiments of the present invention will be apparent to the skilled artisan. The invention is not limited in this respect.

0062 The term “amino acid,” as used herein, includes any naturally occurring and non-naturally occurring amino acid. There are many known non-natural amino acids any of which may be included in the polypeptides or proteins described herein. See, for example, S. Hunt, The Non-Protein Amino Acids, in Chemistry and Biochemistry of the Amino Acids, edited by G. C. Barrett, Chapman and Hall, 1985. Some non-limiting examples of non-natural amino acids are 4-hydroxyproline, desmosine, gamma-aminobutyric acid, beta-cyanoalanine, norvaline, 4-(E)-butenyl-4(R)-methyl-N-methyl-L-threonine, N-methyl-L-leucine, 1-amino-cyclopropanecarboxylic acid, 1-amino-2-phenyl-cyclopropanecarboxylic acid, 1-amino-cyclobutanecarboxylic acid, 4-amino-cyclopentenecarboxylic acid, 3-amino-cyclohexanecarboxylic acid, 4-piperidylacetic acid, 4-amino-1-methylpyrrole-2-carboxylic acid, 2,4-diaminobutyric acid, 2,3-diaminopropionic acid, 2,4-diaminobutyric acid, 2-aminoheptanedioic acid, 4-(aminomethyl)benzoic acid, 4-aminobenzoic acid, ortho-, meta- and para-substituted phenylalanines (e.g., substituted with —C(=O)C6H5; —CF3; —CN; -halo; —NO2; —CH3), disubstituted phenylalanines, substituted tyrosines (e.g., further substituted with —C(=O)C6H5; —CF3; —CN; -halo; —NO2; —CH3), and statine. In the context of amino acid sequences, “X” or “Xaa” represents any amino acid residue, e.g., any naturally occurring and/or any non-naturally occurring amino acid residue.

0063 The term “antibody,” as used herein, refers to a protein belonging to the immunoglobulin superfamily. The terms antibody and immunoglobulin are used interchangeably. With some exceptions, mammalian antibodies are typically made of basic structural units each with two large heavy chains and two small light chains. There are several different types of antibody heavy chains, and several different kinds of antibodies, which are grouped into different isotypes based on which heavy chain they possess. Five different antibody isotypes are known in mammals, IgG, IgA, IgE, IgD, and IgM, which perform different roles, and help direct the appropriate immune response for each different type of foreign object they encounter. In some embodiments, an antibody is an IgG antibody, e.g., an antibody of the IgG1, 2, 3, or 4 human subclass. Antibodies from mammalian species (e.g., human, mouse, rat, goat, pig, horse, cattle, camel) are within the scope of the term, as are antibodies from non-mammalian species (e.g., from birds, reptiles, amphibia), e.g., IgY antibodies.

0064 Only part of an antibody is involved in the binding of the antigen, and antigen-binding antibody fragments, their preparation and use, are well known to those of skill in the art. As is well-known in the art, only a small portion of an antibody molecule, the paratope, is involved in the binding of the antibody to its epitope (see, in general, Clark, W. R. (1986) The Experimental Foundations of Modern Immunology Wiley & Sons, Inc., New York; Roitt, I.

partner with high specificity. Examples for binding agents include, without limitation, antibodies, antibody fragments, nucleic acid molecules, receptors, ligands, aptamers, and
(1991) Essential adnectins. Immunology, 7th Ed., Blackwell Scientific Publications, 0067. The term “click chemistry” refers to a chemical Oxford). Suitable antibodies and antibody fragments for use philosophy introduced by K. Barry Sharpless of The Scripps in the context of some embodiments of the present invention Research Institute, describing chemistry tailored to generate include, for example, human antibodies, humanized antibod covalent bonds quickly and reliably by joining Small units ies, domain antibodies, F(ab'), F(ab'), Fab, Fv, Fc, and Fd comprising reactive groups together (see H. C. Kolb, M. G. fragments, antibodies in which the Fc and/or FR and/or Finn and K. B. Sharpless (2001). Click Chemistry: Diverse CDR1 and/or CDR2 and/or light chain CDR3 regions have Chemical Function from a Few Good Reactions. Angewandte been replaced by homologous human or non-human Chemie International Edition 40 (11): 2004-2021. Click sequences; antibodies in which the FR and/or CDR1 and/or chemistry does not refer to a specific reaction, but to a concept CDR2 and/or light chain CDR3 regions have been replaced including, but not limited to, reactions that mimic reactions by homologous human or non-human sequences; antibodies found in nature. In some embodiments, click chemistry reac in which the FR and/or CDR1 and/or CDR2 and/or light chain tions are modular, wide in scope, give high chemical yields, CDR3 regions have been replaced by homologous human or generate inoffensive byproducts, are stereospecific, exhibit a non-human sequences; and antibodies in which the FR and/or largethermodynamic driving force>84 kJ/mol to favora reac CDR1 and/or CDR2 regions have been replaced by homolo tion with a single reaction product, and/or can be carried out gous human or non-human sequences. In some embodiments, under physiological conditions. 
In some embodiments, a so-called single chain antibodies (e.g., ScPV), (single) click chemistry reaction exhibits high atom economy, can be domain antibodies, and other intracellular antibodies may be carried out under simple reaction conditions, use readily used in the context of the present invention. Domain antibod available starting materials and reagents, uses no toxic Sol ies, camelid and camelized antibodies and fragments thereof, vents or use a solvent that is benign or easily removed (pref for example, VHH domains, or nanobodies, such as those erably water), and/or provides simple product isolation by described in patents and published patent applications of non-chromatographic methods (crystallisation or distilla Ablynx NV and Domantis are also encompassed in the term tion). antibody. Further, chimeric antibodies, e.g., antibodies com 0068. The term "click chemistry handle.” as used herein, prising two antigen-binding domains that bind to different refers to a reactant, or a reactive group, that can partake in a antigens, are also suitable for use in the context of some click chemistry reaction. For example, a strained alkyne, e.g., embodiments of the present invention. a cyclooctyne, is a click chemistry handle, since it can partake 0065. The term “antigen-binding antibody fragment, as in a strain-promoted cycloaddition (see, e.g., Table 1). In used herein, refers to a fragment of an antibody that com general, click chemistry reactions require at least two mol prises the paratope, or a fragment of the antibody that binds to ecules comprising click chemistry handles that can react with the antigen the antibody binds to, with similar specificity and each other. Such click chemistry handle pairs that are reactive affinity as the intact antibody. Antibodies, e.g., fully human with each other are sometimes referred to herein as partner monoclonal antibodies, may be identified using phage dis click chemistry handles. 
For example, an azide is a partner play (or other display methods such as yeast display, ribo click chemistry handle to a cyclooctyne or any other alkyne. Some display, bacterial display). Display libraries, e.g., phage Exemplary click chemistry handles Suitable for use according display libraries, are available (and/or can be generated by to some aspects of this invention are described herein, for one of ordinary skill in the art) that can be screened to identify example, in Tables 1 and 2. Other suitable click chemistry an antibody that binds to an antigen of interest, e.g., using handles are known to those of skill in the art. For two mol panning. See, e.g., Sidhu, S. (ed.) Phage Display in Biotech ecules to be conjugated via click chemistry, the click chem nology and Drug Discovery (Drug Discovery Series: CRC istry handles of the molecules have to be reactive with each Press; 1 ed., 2005; Aitken, R. (ed.) Antibody Phage Display: other, for example, in that the reactive moiety of one of the Methods and Protocols (Methods in Molecular Biology) click chemistry handles can react with the reactive moiety of Humana Press; 2nd ed., 2009. the second click chemistry handle to form a covalent bond. 0066. The term “binding agent, as used herein refers to Such reactive pairs of click chemistry handles are well known any molecule that binds another molecule with high affinity. to those of skill in the art and include, but are not limited to, In some embodiments, a binding agent binds its binding those described in Table 1: TABLE 1

Exemplary click chemistry handles and reactions.

Reaction | Handle A | Handle B
--- | --- | ---
1,3-dipolar cycloaddition | terminal alkyne | azide
Strain-promoted cycloaddition | azide | strained alkyne
Diels-Alder reaction | diene | dienophile
Thiol-ene reaction | thiol | alkene

[Chemical structure drawings for the reactants (bearing R, R1, and R2) and products are not reproduced.]

R, R1, and R2 may represent any molecule comprising a sortase recognition motif. In some embodiments, each occurrence of R, R1, and R2 is independently RR-LPXT-[X]y- or -[X]y-LPXT-RR, wherein each occurrence of X independently represents any amino acid residue, each occurrence of y is an integer between 0 and 10, inclusive, and each occurrence of RR independently represents a protein or an agent (e.g., a protein, peptide, a detectable label, a binding agent, a small molecule, etc.), and, optionally, a linker.
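The partner relationship between the handle pairs of Table 1 can be sketched as a simple lookup. The following Python fragment is only an illustration: the handle names are taken from Table 1, while the dictionary `PARTNER_HANDLES` and the function `partner_reaction` are hypothetical names introduced here and are not part of the disclosure.

```python
# Illustrative sketch only. Handle names follow Table 1; the data structure
# and function name are assumptions made for this example.
PARTNER_HANDLES = {
    frozenset({"terminal alkyne", "azide"}): "1,3-dipolar cycloaddition",
    frozenset({"azide", "strained alkyne"}): "strain-promoted cycloaddition",
    frozenset({"diene", "dienophile"}): "Diels-Alder reaction",
    frozenset({"thiol", "alkene"}): "thiol-ene reaction",
}


def partner_reaction(handle_a, handle_b):
    """Return the Table 1 reaction joining two handles, or None if the
    two handles are not listed as partners. Order does not matter."""
    return PARTNER_HANDLES.get(frozenset({handle_a, handle_b}))
```

For instance, `partner_reaction("strained alkyne", "azide")` looks up the strain-promoted cycloaddition regardless of argument order, whereas two handles that are not listed together in Table 1 yield `None`.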

[0069] In some embodiments, click chemistry handles are used that can react to form covalent bonds in the absence of a metal catalyst. Such click chemistry handles are well known to those of skill in the art and include the click chemistry handles described in Becer, Hoogenboom, and Schubert, Click Chemistry beyond Metal-Catalyzed Cycloaddition, Angewandte Chemie International Edition (2009) 48: 4900-4908:

TABLE 2

Exemplary click chemistry handles and reactions.

No. | Reagent A | Reagent B | Mechanism | Notes on reaction [a] | Reference
--- | --- | --- | --- | --- | ---
0 | azide | alkyne | Cu-catalyzed [3+2] azide-alkyne cycloaddition (CuAAC) | 2 h at 60 °C in H2O | [9]
1 | azide | cyclooctyne | strain-promoted [3+2] azide-alkyne cycloaddition (SPAAC) | 1 h at RT | [6-8, 10, 11]
2 | azide | activated alkyne | [3+2] Huisgen cycloaddition | 4 h at 50 °C | [12]
3 | azide | electron-deficient alkyne | [3+2] cycloaddition | 12 h at RT in H2O | [13]
4 | azide | aryne | [3+2] cycloaddition | 4 h at RT in THF with crown ether or 24 h at RT in CH3CN | [14, 15]
5 | tetrazine | alkene | Diels-Alder retro-[4+2] cycloaddition | 40 min at 25 °C (100% yield); N2 is the only by-product | [36-38]
6 | tetrazole | alkene | 1,3-dipolar cycloaddition (photoclick) | few min UV irradiation and then overnight at 4 °C | [39, 40]
7 | dithioester | diene | hetero-Diels-Alder cycloaddition | 10 min at RT | [43]
8 | anthracene | maleimide | [4+2] Diels-Alder reaction | 2 days at reflux in toluene | [41]
9 | thiol | alkene | radical addition (thio click) | 30 min UV (quantitative conv.) or 24 h UV irradiation (>96%) | [19-23]
10 | thiol | enone | Michael addition | 24 h at RT in CH3CN | [27]
11 | thiol | maleimide | Michael addition | 1 h at 40 °C in THF or 16 h at RT in dioxane | [24-26]
12 | thiol | para-fluoro | nucleophilic substitution | overnight at RT in DMF or 60 min at 40 °C in DMF | [32]
13 | amine | para-fluoro | nucleophilic substitution | 20 min MW at 95 °C in NMP as solvent | [30]

[a] RT = room temperature, DMF = N,N-dimethylformamide, NMP = N-methylpyrrolidone, THF = tetrahydrofuran, CH3CN = acetonitrile.
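As a rough illustration of how the entries of Table 2 might be screened for reactions that proceed without a metal catalyst, the Python sketch below encodes four of the rows and filters on the mechanism text. The row selection, the tuple layout, and the names `TABLE_2_ROWS`, `metal_free_rows`, and `reactions_for_handle` are assumptions made for this example only; the filter simply looks for the string "Cu-catalyzed" and is not a general test for metal dependence.

```python
# Hypothetical encoding of a few Table 2 rows:
# (entry number, reagent A, reagent B, mechanism as given in the table).
TABLE_2_ROWS = [
    (0, "azide", "alkyne", "Cu-catalyzed [3+2] azide-alkyne cycloaddition (CuAAC)"),
    (1, "azide", "cyclooctyne", "strain-promoted [3+2] azide-alkyne cycloaddition (SPAAC)"),
    (5, "tetrazine", "alkene", "Diels-Alder retro-[4+2] cycloaddition"),
    (11, "thiol", "maleimide", "Michael addition"),
]


def metal_free_rows(rows):
    """Keep rows whose mechanism text does not mention copper catalysis."""
    return [row for row in rows if "Cu-catalyzed" not in row[3]]


def reactions_for_handle(rows, handle):
    """Return the rows in which the given handle appears as reagent A or B."""
    return [row for row in rows if handle in (row[1], row[2])]
```

With the four rows encoded here, `metal_free_rows(TABLE_2_ROWS)` drops only entry 0 (CuAAC), and `reactions_for_handle(TABLE_2_ROWS, "azide")` returns entries 0 and 1.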

[0070] The term “conjugated” or “conjugation” refers to an association of two molecules, for example, two proteins or a protein and an agent, e.g., a small molecule, with one another in a way that they are linked by a direct or indirect covalent or non-covalent interaction. In certain embodiments, the association is covalent, and the entities are said to be “conjugated” to one another. In some embodiments, a protein is post-translationally conjugated to another molecule, for example, a second protein, a small molecule, a detectable label, a click chemistry handle, or a binding agent, by forming a covalent bond between the protein and the other molecule after the protein has been formed, and, in some embodiments, after the protein has been isolated. In some embodiments, two molecules are conjugated via a linker connecting both molecules. For example, in some embodiments where two proteins are conjugated to each other to form a protein fusion, the two proteins may be conjugated via a polypeptide linker, e.g., an amino acid sequence connecting the C-terminus of one protein to the N-terminus of the other protein. In some embodiments, two proteins are conjugated at their respective C-termini, generating a C-C conjugated chimeric protein. In some embodiments, two proteins are conjugated at their respective N-termini, generating an N-N conjugated chimeric protein. In some embodiments, conjugation of a protein to a peptide is achieved by transpeptidation using a sortase. See, e.g., Ploegh et al., International PCT Patent Application, PCT/US2010/000274, filed Feb. 1, 2010, published as WO/2010/087994 on Aug. 5, 2010, and Ploegh et al., International Patent Application PCT/US2011/033303, filed Apr. 20, 2011, published as WO/2011/133704 on Oct. 27, 2011, the entire contents of each of which are incorporated herein by reference, for exemplary sortases, proteins, recognition motifs, reagents, and methods for sortase-mediated transpeptidation.

[0071] The term “detectable label” refers to a moiety that has at least one element, isotope, or functional group incorporated into the moiety which enables detection of the molecule, e.g., a protein or peptide, or other entity, to which the label is attached. Labels can be directly attached (i.e., via a bond) or can be attached by a linker (such as, for example, an optionally substituted alkylene; an optionally substituted alkenylene; an optionally substituted alkynylene; an optionally substituted heteroalkylene; an optionally substituted heteroalkenylene; an optionally substituted heteroalkynylene; an optionally substituted arylene; an optionally substituted heteroarylene; or an optionally substituted acylene, or any combination thereof, which can make up a linker). It will be appreciated that the label may be attached to or incorporated into a molecule, for example, a protein, polypeptide, or other entity, at any position. In general, a detectable label can fall into any one (or more) of five classes: a) a label which contains isotopic moieties, which may be radioactive or heavy isotopes, including, but not limited to, 2H, 3H, 13C, 14C, 15N, 18F, 31P, 32P, 35S, 67Ga, 99mTc (Tc-99m), 111In, 123I, 125I, 131I, 153Gd, 169Yb, and 186Re; b) a label which contains an immune moiety, which may be antibodies or antigens, which may be bound to enzymes (e.g., such as horseradish peroxidase); c) a label which is a colored, luminescent, phosphorescent, or fluorescent moiety (e.g., such as the fluorescent label fluorescein-isothiocyanate (FITC)); d) a label which has one or more photo affinity moieties; and e) a label which is a ligand for one or more known binding partners (e.g., biotin-streptavidin, FK506-FKBP). In certain embodiments, a label comprises a radioactive isotope, preferably an isotope which emits detectable particles, such as β particles. In certain embodiments, the label comprises a fluorescent moiety. In certain embodiments, the label is the fluorescent label fluorescein-isothiocyanate (FITC). In certain embodiments, the label comprises a ligand moiety with one or more known binding partners. In certain embodiments, the label comprises biotin. In some embodiments, a label is a fluorescent polypeptide (e.g., GFP or a derivative thereof such as enhanced GFP (EGFP)) or a luciferase (e.g., a firefly, Renilla, or Gaussia luciferase). It will be appreciated that, in certain embodiments, a label may react with a suitable substrate (e.g., a luciferin) to generate a detectable signal. Non-limiting examples of fluorescent proteins include GFP and derivatives thereof, proteins comprising fluorophores that emit light of different colors such as red, yellow, and cyan fluorescent proteins. Exemplary fluorescent proteins include, e.g., Sirius, Azurite, EBFP2, TagBFP, mTurquoise, ECFP, Cerulean, TagCFP, mTFP1, mUkG1, mAG1, AcGFP1, TagGFP2, EGFP, mWasabi, EmGFP, TagYFP, EYFP, Topaz, SYFP2, Venus, Citrine, mKO, mKO2, mOrange, mOrange2, TagRFP, TagRFP-T, mStrawberry, mRuby, mCherry, mRaspberry, mKate2, mPlum, mNeptune, T-Sapphire, mAmetrine, mKeima. See, e.g., Chalfie, M. and Kain, S R (eds.) Green fluorescent protein: properties, applications, and protocols (Methods of biochemical analysis, v. 47), Wiley-Interscience, Hoboken, N.J., 2006; and Chudakov, D M, et al., Physiol Rev. 90(3):1103-63, 2010, for discussion of GFP and numerous other fluorescent or luminescent proteins. In some embodiments, a label comprises a dark quencher, e.g., a substance that absorbs excitation energy from a fluorophore and dissipates the energy as heat.

[0072] The term “linker,” as used herein, refers to a chemical group or molecule covalently linked to a molecule, for example, a protein, and a chemical group or moiety, for example, a click chemistry handle. In some embodiments, the linker is positioned between, or flanked by, two groups, molecules, or moieties and connected to each one via a covalent bond, thus connecting the two. In some embodiments, the linker is an amino acid or a plurality of amino acids. In some embodiments, the linker comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more than 20 amino acids. In some embodiments, the linker comprises a poly-glycine sequence. In some embodiments, the linker comprises a GGGGS sequence (SEQ ID NO: 19), or a plurality of such sequences, e.g., a GGGGSGGGGS sequence (SEQ ID NO: 20). In some embodiments, the linker comprises a non-protein structure. In some embodiments, the linker is an organic molecule, group, polymer, or chemical moiety.

[0073] The terms “nucleic acid” and “nucleic acid molecule,” as used herein, refer to a compound comprising a nucleobase and an acidic moiety, e.g., a nucleoside, a nucleotide, or a polymer of nucleotides. Typically, polymeric nucleic acids, e.g., nucleic acid molecules comprising three or more nucleotides, are linear molecules, in which adjacent nucleotides are linked to each other via a phosphodiester linkage. In some embodiments, “nucleic acid” refers to individual nucleic acid residues (e.g. nucleotides and/or nucleosides). In some embodiments, “nucleic acid” refers to an oligonucleotide chain comprising three or more individual nucleotide residues. As used herein, the terms “oligonucleotide” and “polynucleotide” can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least three nucleotides). In some embodiments, “nucleic acid” encompasses RNA as well as single and/or double-stranded DNA. Nucleic acids may be naturally occurring, for example, in the context of a genome, a transcript, an mRNA, tRNA, rRNA, siRNA, snRNA, a plasmid, cosmid, chromosome, chromatid, or other naturally occurring nucleic acid molecule. On the other hand, a nucleic acid molecule may be a non-naturally occurring molecule, e.g., a recombinant DNA or RNA, an artificial chromosome, an engineered genome, or fragment thereof, or a synthetic DNA, RNA, DNA/RNA hybrid, or including non-naturally occurring nucleotides or nucleosides. Furthermore, the terms “nucleic acid,” “DNA,” “RNA,” and/or similar terms include nucleic acid analogs, i.e. analogs having other than a phosphodiester backbone. Nucleic acids can be purified from natural sources, produced using recombinant expression systems, chemically synthesized, and, optionally, purified. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, and backbone modifications. In some embodiments, a nucleic acid is or comprises natural nucleosides (e.g. adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g. 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5'-N-phosphoramidite linkages).

[0074] The terms “protein,” “peptide” and “polypeptide” are used interchangeably herein, and refer to a polymer of amino acid residues linked together by peptide (amide) bonds. The terms refer to a protein, peptide, or polypeptide of any size, structure, or function. Typically, a protein, peptide, or polypeptide will be at least three amino acids long. A protein, peptide, or polypeptide may refer to an individual protein or a collection of proteins. One or more of the amino acids in a protein, peptide, or polypeptide may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. A protein, peptide, or polypeptide may also be a single molecule or may be a multi-molecular complex. A protein, peptide, or polypeptide may be just a fragment of a naturally occurring protein or peptide. A protein, peptide, or polypeptide may be naturally occurring, recombinant, or synthetic, or any combination thereof.

[0075] The term “small molecule” is used herein to refer to molecules, whether naturally-occurring or artificially created (e.g., via chemical synthesis) that have a relatively low molecular weight. Typically, a small molecule is an organic compound (i.e., it contains carbon). A small molecule may contain multiple carbon-carbon bonds, stereocenters, and other functional groups (e.g., amines, hydroxyl, carbonyls, heterocyclic rings, etc.). In some embodiments, small molecules are monomeric and have a molecular weight of less than about 1500 g/mol. In certain embodiments, the molecular weight of the small molecule is less than about 1000 g/mol or less than about 500 g/mol. In certain embodiments, the small molecule is a drug, for example, a drug that has already been deemed safe and effective for use in humans or animals by the appropriate governmental agency or regulatory body.

[0076] The term “sortase,” as used herein, refers to an enzyme able to carry out a transpeptidation reaction conjugating the C-terminus of a protein to the N-terminus of a protein via transamidation. Sortases are also referred to as transamidases, and typically exhibit both a protease and a transpeptidation activity. Various sortases from prokaryotic organisms have been identified. For example, some sortases from Gram-positive bacteria cleave and translocate proteins to proteoglycan moieties in intact cell walls. Among the sortases that have been isolated from Staphylococcus aureus are sortase A (Srt A) and sortase B (Srt B). Thus, in certain embodiments, a transamidase used in accordance with the present invention is sortase A, e.g., from S. aureus, also referred to herein as SrtA. In certain embodiments, a transamidase is a sortase B, e.g., from S. aureus, also referred to herein as SrtB.

[0077] Sortases have been classified into 4 classes, designated A, B, C, and D, designated sortase A, sortase B, sortase C, and sortase D, respectively, based on sequence alignment and phylogenetic analysis of 61 sortases from Gram-positive bacterial genomes (Dramsi S, Trieu-Cuot P, Bierne H, Sorting sortases: a nomenclature proposal for the various sortases of Gram-positive bacteria. Res Microbiol. 156(3):289-97, 2005; the entire contents of which are incorporated herein by reference). These classes correspond to the following subfamilies, into which sortases have also been classified by Comfort and Clubb (Comfort D, Clubb R T. A comparative genome analysis identifies distinct sorting pathways in gram-positive bacteria. Infect Immun., 72(5):2710-22, 2004; the entire contents of which are incorporated herein by reference): Class A (Subfamily 1), Class B (Subfamily 2), Class C (Subfamily 3), Class D (Subfamilies 4 and 5). The aforementioned references disclose numerous sortases and recognition motifs. See also Pallen, M. J.; Lam, A. C.; Antonio, M.; Dunbar, K. TRENDS in Microbiology, 2001, 9(3), 97-101; the entire contents of which are incorporated herein by reference. Those skilled in the art will readily be able to assign a sortase to the correct class based on its sequence and/or other characteristics such as those described in Dramsi, et al., supra. The term “sortase A” is used herein to refer to a class A sortase, usually
named SrtA in any particular bacterial species, e.g., SrtA from S. aureus. Likewise, “sortase B” is used herein to refer to a class B sortase, usually named SrtB in any particular bacterial species, e.g., SrtB from S. aureus. The invention encompasses embodiments relating to a sortase A from any bacterial species or strain. The invention encompasses embodiments relating to a sortase B from any bacterial species or strain. The invention encompasses embodiments relating to a class C sortase from any bacterial species or strain. The invention encompasses embodiments relating to a class D sortase from any bacterial species or strain.

[0078] Amino acid sequences of Srt A and Srt B and the nucleotide sequences that encode them are known to those of skill in the art and are disclosed in a number of references cited herein, the entire contents of all of which are incorporated herein by reference. The amino acid sequences of S. aureus SrtA and SrtB are homologous, sharing, for example, 22% sequence identity and 37% sequence similarity. The amino acid sequence of a sortase-transamidase from Staphylococcus aureus also has substantial homology with sequences of enzymes from other Gram-positive bacteria, and such transamidases can be utilized in the ligation processes described herein. For example, for SrtA there is about a 31% sequence identity (and about 44% sequence similarity) with best alignment over the entire sequenced region of the S. pyogenes open reading frame. There is about a 28% sequence identity with best alignment over the entire sequenced region of the A. naeslundii open reading frame. It will be appreciated that different bacterial strains may exhibit differences in sequence of a particular polypeptide, and the sequences herein are exemplary.

[0079] In certain embodiments a transamidase bearing 18% or more sequence identity, 20% or more sequence identity, or 30% or more sequence identity with an S. pyogenes, A. naeslundii, S. mutans, E. faecalis or B. subtilis open reading frame encoding a sortase can be screened, and enzymes having transamidase activity comparable to Srt A or Srt B from S. aureus can be utilized (e.g., comparable activity sometimes is 10% of Srt A or Srt B activity or more).

[0080] Thus in some embodiments of the invention the sortase is a sortase A (SrtA). SrtA recognizes the motif LPXTX (wherein each occurrence of X represents independently any amino acid residue), with common recognition motifs being, e.g., LPKTG (SEQ ID NO: 21), LPATG (SEQ ID NO: 22), LPNTG (SEQ ID NO: 23). In some embodiments LPETG (SEQ ID NO: 10) is used as the sortase recognition motif. However, motifs falling outside this consensus may also be recognized. For example, in some embodiments the motif comprises an “A” rather than a “T” at position 4, e.g., LPXAG (SEQ ID NO: 24), e.g., LPNAG (SEQ ID NO: 25). In some embodiments the motif comprises an “A” rather than a “G” at position 5, e.g., LPXTA (SEQ ID NO: 26), e.g., LPNTA (SEQ ID NO: 27). In some embodiments the motif comprises a “G” rather than “P” at position 2, e.g., LGXTG (SEQ ID NO: 28), e.g., LGATG (SEQ ID NO: 29). In some embodiments the motif comprises an “I” rather than “L” at position 1, e.g., IPXTG (SEQ ID NO: 30), e.g., IPNTG (SEQ ID NO: 31) or IPETG (SEQ ID NO: 32). Additional suitable sortase recognition motifs will be apparent to those of skill in the art, and the invention is not limited in this respect. It will be appreciated that the terms “recognition motif” and “recognition sequence,” with respect to sequences recognized by a transamidase or sortase, are used interchangeably.

[0081] In some embodiments of the invention the sortase is a sortase B (SrtB), e.g., a sortase B of S. aureus, B. anthracis, or L. monocytogenes. Motifs recognized by sortases of the B class (SrtB) often fall within the consensus sequences NPXTX, e.g., NP[Q/K]-[T/s]-[N/G/s], such as NPQTN (SEQ ID NO: 33) or NPKTG (SEQ ID NO: 34). For example, sortase B of S. aureus or B. anthracis cleaves the NPQTN (SEQ ID NO: 35) or NPKTG (SEQ ID NO: 36) motif of IsdC in the respective bacteria (see, e.g., Marraffini, L. and Schneewind, O. Journal of Bacteriology, 189(17), p. 6425-6436, 2007). Other recognition motifs found in putative substrates of class B sortases are NSKTA (SEQ ID NO: 37), NPQTG (SEQ ID NO: 38), NAKTN (SEQ ID NO: 39), and NPQSS (SEQ ID NO: 40). For example, SrtB from L. monocytogenes recognizes certain motifs lacking P at position 2 and/or lacking Q or K at position 3, such as NAKTN (SEQ ID NO: 41) and NPQSS (SEQ ID NO: 42) (Mariscotti J F, García-Del Portillo F, Pucciarelli M G. The Listeria monocytogenes sortase-B recognizes varied amino acids at position two of the sorting motif. J Biol Chem. 2009 Jan. 7.)

[0082] In some embodiments, the sortase is a sortase C (Srt C). Sortase C may utilize LPXTX as a recognition motif, with each occurrence of X independently representing any amino acid residue.

[0083] In some embodiments, the sortase is a sortase D (Srt D). Sortases in this class are predicted to recognize motifs with a consensus sequence NA-[E/A/S/H]-TG (Comfort D, supra). Sortase D has been found, e.g., in Streptomyces spp., Corynebacterium spp., Tropheryma whipplei, Thermobifida fusca, and Bifidobacterium longum. LPXTA (SEQ ID NO: 43) or LAXTG (SEQ ID NO: 44) may serve as a recognition sequence for sortase D, e.g., of subfamilies 4 and 5, respectively (subfamily-4 and subfamily-5 enzymes process the motifs LPXTA (SEQ ID NO: 45) and LAXTG (SEQ ID NO: 46), respectively). For example, B. anthracis sortase C has been shown to specifically cleave the LPNTA (SEQ ID NO: 47) motif in B. anthracis BasI and BasH (see Marraffini, supra).

[0084] See Barnett and Scott for description of a sortase that recognizes the QVPTGV (SEQ ID NO: 48) motif (Barnett, T C and Scott, J R. Differential Recognition of Surface Proteins in Streptococcus pyogenes by Two Sortase Gene Homologs. Journal of Bacteriology, Vol. 184, No. 8, p. 2181-2191, 2002; the entire contents of which are incorporated herein by reference). Additional sortases, including, but not limited to, sortases recognizing additional sortase recognition motifs are also suitable for use in some embodiments of this invention. For example, sortases described in Chen I, Dorr B M, and Liu D R. A general strategy for the evolution of bond-forming enzymes using yeast display. Proc Natl Acad Sci USA. 2011 Jul. 12; 108(28): 11399, the entire contents of which are incorporated herein.

[0085] The use of sortases found in any gram-positive organism, such as those mentioned herein and/or in the references (including databases) cited herein is contemplated in the context of some embodiments of this invention. Also contemplated is the use of sortases found in gram negative bacteria, e.g., Colwellia psychrerythraea, Microbulbifer degradans, Bradyrhizobium japonicum, Shewanella oneidensis, and Shewanella putrefaciens. Such sortases recognize sequence motifs outside the LPXTX consensus, for example, LP[Q/K]T[A/S]T (SEQ ID NO: 289). In keeping with the variation tolerated at position 3 in sortases from gram-positive organisms, a sequence motif LPXT[A/S], e.g., LPXTA (SEQ ID NO: 49) or LPSTS (SEQ ID NO: 50) may be used.

[0086] Those of skill in the art will appreciate that any sortase recognition motif known in the art can be used in some embodiments of this invention, and that the invention is not limited in this respect. For example, in some embodiments the sortase recognition motif is selected from: LPKTG (SEQ ID NO: 51), LPITG (SEQ ID NO: 52), LPDTA (SEQ ID NO: 53), SPKTG (SEQ ID NO: 54), LAETG (SEQ ID NO: 55), LAATG (SEQ ID NO: 56), LAHTG (SEQ ID NO: 57), LASTG (SEQ ID NO: 58), LAETG (SEQ ID NO: 59), LPLTG (SEQ ID NO: 60), LSRTG (SEQ ID NO: 61), LPETG (SEQ ID NO: 10), VPDTG (SEQ ID NO: 62), IPQTG (SEQ ID NO: 63), YPRRG (SEQ ID NO: 64), LPMTG (SEQ ID NO: 65), LPLTG (SEQ ID NO: 66), LAFTG (SEQ ID NO: 67), LPQTS (SEQ ID NO: 68), it being understood that in various embodiments of the invention the 5th residue may be replaced with any other amino acid residue. For example, the sequence used may be LPXT, LAXT, LPXA, LGXT, IPXT, NPXT, NPQS (SEQ ID NO: 69), LPST (SEQ ID NO: 70), NSKT (SEQ ID NO: 71), NPQT (SEQ ID NO: 72), NAKT (SEQ ID NO: 73), LPIT (SEQ ID NO: 74), LAET (SEQ ID NO: 75), or NPQS (SEQ ID NO: 76). The invention encompasses embodiments in which X in any sortase recognition motif disclosed herein or known in the art is any amino acid, for example, any naturally-occurring or any non-naturally occurring amino acid. In some embodiments, X is selected from the 20 standard amino acids found most commonly in proteins found in living organisms. In some embodiments, e.g., where the recognition motif is LPXTG (SEQ ID NO: 78) or LPXT, X is D, E, A, N, Q, K, or R. In some embodiments, X in a particular recognition motif is selected from those amino acids that occur naturally at position 3 in a naturally occurring sortase substrate. For example, in some embodiments X is selected from K, E, N, Q, A in an LPXTG (SEQ ID NO: 78) or LPXT motif where the sortase is a sortase A. In some embodiments X is selected from K, S, E, L, A, N in an LPXTG (SEQ ID NO: 78) or LPXT motif and a class C sortase is used.

[0087] In some embodiments, a sortase recognition

may be phosphorylated, thereby rendering it refractory to recognition and cleavage by SrtA. The masked recognition sequence can be unmasked by treatment with a phosphatase, thus allowing it to be used in a SrtA-catalyzed transamidation reaction.

[0089] The term “sortase substrate,” as used herein, refers to any molecule that is recognized by a sortase, for example, any molecule that can partake in a sortase-mediated transpeptidation reaction. A typical sortase-mediated transpeptidation reaction involves a substrate comprising a C-terminal sortase recognition motif, e.g., an LPXTX motif, and a second substrate comprising an N-terminal sortase recognition motif, e.g., an N-terminal polyglycine or polyalanine. A sortase substrate may be a peptide or a protein, for example, a target protein on the surface of a virus, or a peptide comprising a sortase recognition motif such as an LPXTX motif or a polyglycine or polyalanine, wherein the peptide is conjugated to an agent, e.g., a small molecule, a binding agent, or a fluorophore. Accordingly, both proteins and non-protein molecules can be sortase substrates as long as they comprise a sortase recognition motif. Some examples of sortase substrates are described in more detail elsewhere herein and additional suitable sortase substrates will be apparent to the skilled artisan. The invention is not limited in this respect.

[0090] The term “sortagging,” as used herein, refers to the process of adding a tag, e.g., a moiety or molecule, for example, a protein, polypeptide, detectable label, binding agent, or click chemistry handle, onto a target molecule, for example, a target protein on the surface of a viral particle via a sortase-mediated transpeptidation reaction. Examples of additional suitable tags include, but are not limited to, amino acids, nucleic acids, polynucleotides, sugars, carbohydrates, polymers, lipids, fatty acids, and small molecules. Other suitable tags will be apparent to those of skill in the art and the invention is not limited in this aspect. In some embodiments, a tag comprises a sequence useful for purifying, expressing, solubilizing, and/or detecting a polypeptide. In some embodiments, a tag can serve multiple functions. In some embodiments, the tag is relatively small, e.g., ranging from a few amino acids up to about 100 amino acids long. In some embodiments, a tag is more than 100 amino acids long, e.g., up to about 500 amino acids long, or more.
In some embodi sequence further comprises one or more additional amino ments, a tag comprises an HA, TAP, Myc, 6xHis, Flag, acids, e.g., at the N or C terminus. For example, one or more streptavidin, biotin, or GST tag, to name a few examples. In amino acids (e.g., up to 5 amino acids) having the identity of Some embodiments, a tag comprises a solubility-enhancing amino acids found immediately N-terminal to, or C-terminal tag (e.g., a SUMO tag, NUSA tag, SNUT tag, or a monomeric to, a 5 amino acid recognition sequence in a naturally occur mutant of the Ocr protein of bacteriophage T7). See, e.g., ring Sortase Substrate may be incorporated. Such additional Esposito D and Chatterjee DK. Curr Opin Biotechnol.; 17(4): amino acids may provide context that improves the recogni 353-8 (2006). In some embodiments, a tag is cleavable, so tion of the recognition motif. that it can be removed, e.g., by a protease. In some embodi 0088. In some embodiments, a sortase recognition motif is ments, this is achieved by including a protease cleavage site in masked. In contrast to an unmasked Sortase recognition the tag, e.g., adjacent or linked to a functional portion of the motif, which can be can be recognized by a Sortase, a masked tag. Exemplary proteases include, e.g., thrombin, TEV pro Sortase recognition motif is a motif that is not recognized by tease, Factor Xa, PreScission protease, etc. In some embodi a sortase but that can be readily modified (“unmasked') such ments, a “self-cleaving tag is used. See, e.g., Wood et al., that the resulting motif is recognized by the sortase. For International PCT Application PCT/US2005/05763, filed on example, in Some embodiments at least one amino acid of a Feb. 24, 2005, and published as WO/2005/086654 on Sep. 22, masked sortase recognition motif comprises a side chain 2005. comprising a moiety that inhibits, e.g., prevents, recognition 0091. 
The term “target protein, as used herein in the con of the sequence by a Sortase of interest, e.g., SrtA text of sortase-mediated modification of viral particles, refers Removal of the inhibiting moiety, in turn, allows recognition to a protein on the Surface of a virus that is the target of a of the motif by the Sortase. Masking may, for example, reduce Sortase-mediated conjugation. For example, in an embodi recognition by at least 80%, 90%. 95%, or more (e.g., to ment where M13 pII is modified by Sortagging, e.g., by undetectable levels) in certain embodiments. By way of adding a detectable label or a binding agent to M13 plII on the example, in certain embodiments a threonine residue in a surface of an M13 bacteriophage particle, pIII is the target sortase recognition motif such as LPXTG (SEQ ID NO: 78) protein. The term “target protein’ may refer to a wild type or US 2014/003 0697 A1 Jan. 30, 2014 naturally occurring form of the respective protein, or to an (e.g. influenza viruses); Bunyaviridae engineered form, for example, to a recombinant protein vari (e.g. Hantaan viruses, bunga viruses, phleboviruses and Nairo ant comprising a sortase recognition motif not contained in a viruses); Arenaviridae (hemorrhagic fever viruses); Reoviri wild-type form of the protein. The term “modifying a target dae (erg., reoviruses, orbiviurses and ); Birnaviri protein, as used herein in the context of Sortase-mediated dae: (Hepatitis B virus); (par protein modification, refers to a process of altering a target voviruses); Papovaviridae (papilloma viruses, polyoma protein comprising a Sortase recognition motif via a sortase viruses); : ( mediated transpeptidation reaction. Typically, the modifying (HSV) 1 and 2, varicella Zoster virus, results in the target protein being conjugated to an agent, for (CMV), EBV, KSV): Poxyiridae (variola viruses, vaccinia example, a peptide, protein, binding agent, detectable label, viruses, pox viruses); and Picornaviridae (e.g. 
polio viruses, or Small molecule. hepatitis A virus; , human coxsackie viruses, 0092. The term “virus, as used interchangeably herein , echoviruses). In some embodiments, the virus is with the term “viral particle.” refers to an infectious agent that a bacteriophage, for example, a bacteriophage belonging to can infect a living cell. A virus particle typically comprises the family of (e.g., T4 phage), (e.g., the viral genome, e.g., as DNA, RNA, or a DNA/RNA hybrid, k phage, Bacteriophage T5), (e.g., T7 phage), proteins associated with the viral genome that form a viral , , Rudiviridae, Ampullaviri coat, and, in some cases an envelope of lipids that Surrounds dae, Bacilloviridae, , , Corti the viral protein coat. In some embodiments, a viral particle coviridae, Cystoviridae, , , Gut comprises a viral genome that can replicate inside a host cell tavirus, Inoviridae, Leviviridae (e.g., MS2, QB), once the virus has infected the cell. In some embodiments, the (e.g., dX174), , or Tectiviridae. viral functions encoded in the viral genome result in the Exemplary Suitable bacteriophages include, without limita production of new viral particles by the host cell. In some tion, Lambda phage (w phage, lysogen). T2 phage, T4 phage, embodiments, the newly generated viral particles can them T7 phage, T12 phage, R17 phage, M13 phage, MS2 phage, selves infect additional host cells. Suitable viruses for use in G4 phage, P1 phage, Enterobacteria phage P2, P4 phage, the context of this invention typically comprise at least one dX174 phage, N4 phage, d6 phage, and d29 phage. Addi Surface protein comprising a sortase recognition motif. In tional bacteriophages Suitable for Surface functionalization Some embodiments, the Sortase recognition motif is com using methods, reagents, and kits provided herein will be prised in a wild-type viral protein (e.g., a capsid protein or a apparent to those of skill in the art. Suitable bacteriophages viral surface protein). 
In some embodiments, the sortase rec include, for example, bacteriophages described in Stephen T. ognition motif is encoded by a recombinant viral genome, Abedon, The Bacteriophages, Oxford University Press, e.g., a viral genome in which an open reading frame has been USA; 2' edition, Dec. 15, 2005, ISBN: 0195148509; par altered to insert a sortase recognition motif. A virus Suitable ticularly in parts III-V, pages 129-653; Elizabeth Kutter and for use according to aspects of this invention may be recom Alexander Sulakvelidze: Bacteriophages. Biology and Appli binant, and comprise genetic alterations other than the addi cations. CRC Press; 1' edition (December 2004), ISBN: tion of a sortase recognition motif to a Surface protein. For 0849313368; Martha R. J. Clokie and Andrew M. Kropinski: example, in some embodiment, a virus may be used that is Bacteriophages: Methods and Protocols, Volume 1: Isola replication-incompetent, or that carries in its genome a select tion, Characterization, and Interactions (Methods in Molecu able marker, e.g., an antibiotic resistance marker, that can be lar Biology) Humana Press; 1 edition (December, 2008), used to identify cells infected by the virus. Viruses can be ISBN: 1588296822; Martha R. J. Clokie and Andrew M. classified according to their genome structure and type of Kropinski: Bacteriophages. Methods and Protocols, Volume nucleic acid comprised in the respective viral particles. A 2: Molecular and Applied Aspects (Methods in Molecular Suitable virus according to aspects of this invention may be a Biology) Humana Press; 1 edition (December 2008), ISBN: dsDNA virus comprising a double-stranded DNA genome 1603275649; all of which are incorporated herein in their (e.g. adenoviruses, herpesviruses, poxviruses), an SSDNA entirety by reference for disclosure of suitable phages and virus comprising a single-stranded DNA genome (e.g. 
par host cells as well as methods and protocols for isolation, voviruses), a dsRNA virus comprising a double-stranded culture, and manipulation of Such phages. RNA genome (e.g. reoviruses), a (+)SSRNA virus comprising 0093. In some embodiments, the phage is a filamentous a single Stranded (+)sense Strand RNA genome (e.g. picor phage. In some embodiments, the phage is an M13 phage. naviruses, togaviruses), a (-)ssRNA virus comprising a Wild-type M13 phage particles comprise a circular, single single stranded (-)sense RNA (e.g. orthomyxoviruses, rhab Stranded genome of approximately 6.4 kb. The wild-type doviruses), an SSRNA-RT virus comprising a single-stranded genome includes ten genes, g-gX, which, in turn, encode the (+)sense RNA with a DNA intermediate genome in its ten M13 proteins, pI-pX, respectively. gVIII encodes pVIII, cycle that is generated by reverse transcription of the RNA also often referred to as the major structural protein of the genome (e.g. ), or a dsDNA-RT virus (e.g. hep phage particles, whileg|II encodes p, also referred to as the adnaviruses). Exemplary viruses include, e.g., Retroviridae minor coat protein, which is required for infectivity of M13 (e.g., lentiviruses such as human immunodeficiency viruses, phage particles. The M13 phage genome has extensively been Such as HIV-I); (e.g. strains that cause gastro studied and can be manipulated with recombinant techniques enteritis); Togaviridae (e.g. equine encephalitis viruses, well known to those of skill in the art. For example, one or rubella viruses); Flaviridae (e.g. dengue viruses, encephalitis more of the wild-type genes can be deleted in whole or in part, viruses, yellow fever viruses, hepatitis C virus); Coronaviri and/or a heterologous nucleic acid construct can be inserted dae (e.g. coronaviruses); (e.g. vesicular sto into the M13 genome. Such recombinant M13 phage matitis viruses, rabies viruses); (e.g. Ebola genomes can be packaged into M13 phage particles in the viruses); (e.g. 
parainfluenza viruses, presence of packaging proteins (e.g., pIII, pVI, pVII, pVIII, mumps virus, measles virus, respiratory syncytial virus); and pLX). The size of the M13 particles depends mainly on the US 2014/003 0697 A1 Jan. 30, 2014 18 size of the packaged genome. M13 does not have stringent Information (NCBI) database (www.ncbi.nlm.nih.gov) and genome size restrictions, and insertions of up to 42 kb have the ENSEMBL database (www.ensembl.org). An exemplary been reported. The M13 phage genome has been sequences, M13 genomic sequence is provided in entry V00604 of the and M13 genomic sequences can be retrieved from public National Center for Biotechnology Information (NCBI) data databases, such as the National Center for Biotechnology base (www.ncbi.nlm.nih.gov):

>gi|56713234|emb|V00604.2| Phage M13 genome (SEQ ID NO: 79)

AACGCTACTACTATTAGTAGAATTGATGCCACC TTCAGCTCGCGCCCCAAATGAAAATA TAG

CTAAACAGGTTATTGACCATTTGCGAAATGTATC TAATGGTCAAACTAAATCTACTCGTTCGCA

GAATTGGGAATCAACTGTTACATGGAA GAAAC CCAGACACCGTACTTTAGTTGCATAT TA

AAACATGTTGAGCTACAGCACCAGATTCAGCAA AAGCTCTAAGCCATCCGCAAAAATGACC

CTTATCAAAAGGAGCAATTAAAGGTAC CTCTAA CCTGACCTGTTGGAGTTTGCTTCCGG C

GGTTCGCTTTGAAGCTCGAAT AAAACGCGATA TGAAGTCTTTCGGGCTTCCT CTTAATCT

TTTGATGCAATCCGCTTTGCT CTGACT ATAATAGTCAGGGTAAAGACCTGATT TTGATT A.

GGTCATTCTCGTTTTCTGAACT TTAAAGCAT GAGGGGGATTCAATGAATA TTATGACGA

TTCCGCAGTATTGGACGCTATCCAGTC AAACA TTACTATTACCCCCTCTGGCAAAACT C

TTTGCAAAAGCCTCTCGCTATT TTGGT TTTATCGTCGTCTGGTAAACGAGGGT ATGATAGTG

TTGC CTTACTATGCCTCGTAATT CCT TTGGCGTTATGTATCTGCA TAGTTGAATGTGGTA

TCCTAAATCTCAACTGATGAA CTTTC ACCTGTAATAATGTTGTTCCGTTAGT CGTTTTAT

AACG AGATTTTTCTTCCCAACGT CCTGACTGGTATAATGAGCCAGT CTTAAAATCGCATAAG

GTAATTCACAATGATTAAAG GAAAT AAACCATCTCAAGCCCAAT TACTACTCGTTCTGG

GTTTCTCGTCAGGGCAAGCC A. TCACTGAATGAGCAGCTTTGTTACGTTGAT TGGGTAATG

AATATCCGGTTCTTGTCAAG TACTC TGATGAAGGTCAGCCAGCC ATGCGCCTGGTCTGTA

CACCGTTCATCTGTCCTCTT CAAAGT GGTCAGTTCGGTTCCCTTA GATTGACCGTCTGCGC

CTCGTTCCGGCTAAGTAACA GGAGCAGG CGCGGATTTCGACACAA TTATCAGGCGATGATA

CAAATCTCCGTTGTACTTTG TCGCGCT GGTATAATCGCTGGGGG CAAAGATGAGTGTTTT

AGTGTATTCTTTCGCCTCTT CGTTTTAGGTTGGTGCCTTCGTAGTGGCATTACGT (ATTTTACC

CGTTTAATGGAAACTTCCTCA GAAAAAG CTTTAGTCCT CAAAGCC CTGTAGCCGTTGCTAC

CCTCGTTCCGATGCTGTCTT CGCTGCTGAGGGTGACGATCCCGCAAAAGCGGCC TTAACTCC

CTGCAAGCCTCAGCGACCGAA ATATCGG TATGCGTGGGCGATGGT GTTGTCATTGTCGGCG

CAACTATCGGTATCAAGCTG TAAGAAA TCACCTCGAAAGCAAGC GATAAACCGATACAAT

TAAAGGCTCCTTTTGGAGCC TTTTTTTGGAGATTTTCAACATGAAAAAATTAT ATTCGCAA

TTCCTTTAGTTGTTCCTTTC ATTC, TCAC CCGCTGAAACTGTTGAAAGTTGTTTAGCAAAACC

CCATACAGAAAATTCATTTAC TAACGTCTGGAAAGACGACAAAACTTTAGATCGT ACGCTAAC

TATGAGGGTTGTCTGTGGAA GCTACAGGCGTTGTAGTTTGTACTGGTGACGAAACTCAGTGTT

ACGGTACATGGGTTCCTA TGGGCTTGCTATCCCTGAAAATGAGGGTGGTGGCTC GAGGGTGG

CGGTTCTGAGGGTGGCGG TCTGAGGGTGGCGGTACTAAACCTCCTGAGTACGGTGATACACCT

ATTCCGGGCTATACTTATATCAACCCTCTCGACGGCACTTATCCGCCTGGTACTGAGCAAAACC

CCGCTAATCCTAATCCTTCTCTTGAGGAGTCTCAGCCTCTTAATACTTTCATGTTTCAGAATAA

TAGGTTCCGAAATAGGCAGGGGGCATTAACTGTTTATACGGGCACTGTTACTCAAGGCACTGAC

CCCGTTAAAACTTATTACCAGTACACTCCTGTATCATCAAAAGCCATGTATGACGCTTACTGGA

ACGGTAAATTCAGAGACTGCGCTTTCCATTCTGGCTTTAATGAGGATCCATTCGTTTGTGAATA
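
A sequence transcribed from a scanned page, like the record above, can carry stray spaces and punctuation. The following is a minimal sketch, not part of the original disclosure, of how such a FASTA-style record might be normalized and checked before it is compared against the database entry. The function name `clean_fasta_record` and the sample record are hypothetical. Removing stray characters does not restore bases lost in transcription, so the database entry remains the authoritative source.

```python
import re


def clean_fasta_record(text):
    """Return (header, sequence, stray_chars) for one FASTA-style record.

    Whitespace inside the sequence is removed; any character other than
    A, C, G, or T is reported in stray_chars and dropped from the sequence.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = ""
    if lines and lines[0].startswith(">"):
        header = lines[0][1:].strip()
        lines = lines[1:]
    raw = re.sub(r"\s+", "", "".join(lines)).upper()
    stray = sorted(set(ch for ch in raw if ch not in "ACGT"))
    sequence = "".join(ch for ch in raw if ch in "ACGT")
    return header, sequence, stray


# Hypothetical sample imitating transcription damage (not the M13 genome).
sample = """>example record (hypothetical)
AACGCTACT ACTATT

GGTT(CA, TTA.
"""

header, sequence, stray = clean_fasta_record(sample)
print(header)
print(len(sequence), "nucleotides kept; stray characters:", stray)
```

A cleaned length that falls well short of the approximately 6.4 kb stated above for wild-type M13 would indicate that the transcription is incomplete.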


[0094] The term "viral capsid," as used herein, refers to a protein coat, also sometimes referred to as a protein shell, of a virus. The viral capsid encloses the viral genetic material. The capsid of most viruses comprises a plurality of oligomeric structural subunits made of proteins called protomers. The observable 3-dimensional morphological subunits, which may or may not correspond to individual proteins, are called capsomeres. Viral capsids can be classified according to their structure, e.g., into helical and icosahedral capsids. Some viruses, e.g., bacteriophages, have developed more complicated structures. Some viral capsids are enveloped with a lipid membrane known as the viral envelope, which is typically acquired by the capsid from a membrane of the host cell.

DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS

[0095] This invention is based, at least in part, on the recognition that sortases can be exploited to conjugate a variety of moieties to the proteins on the surface of viruses, for example, to the capsid proteins of M13 bacteriophage. Such sortase-mediated conjugation approaches can be used to confer new functions to viral particles. For example, the conjugation of a detectable label allows for the isolation and/or quantification of viral particles and can also be used to label cells bound or infected by the viral particles. For another example, sortase-mediated conjugation of binding moieties, for example, of antibodies or antibody fragments, nucleic acids, or of biotin and streptavidin, can be used to confer new binding properties to viral particles, e.g., in order to generate complex structures of associated, e.g., concatenated, viral particles.

[0096] Some aspects of this disclosure provide methods, reagents, and kits that can be used to functionalize proteins on the surface of viruses, for example, by conjugating such proteins to a molecule or a plurality of molecules conferring a desired function. Examples of such molecules include, without limitation, detectable labels, small molecules, and binding agents. The sortase-mediated techniques described herein allow for functionalization of viral surface proteins with high specificity and with efficiencies that surpass those of any known recombinant techniques, such as methods used in the context of phage display technology. Another advantage of the methods, reagents, and kits provided herein is that agents (e.g., proteins, binding agents, or small molecules) can be conjugated to viral surface proteins that cannot be genetically encoded, e.g., because of size limitations for insertions into the viral gene or genome encoding a target viral protein to be modified, or because the agent is not a gene product that can be encoded by the viral genome.

[0097] For example, capsid proteins (e.g., pIII, pIX, and pVIII) of bacteriophage M13 can be functionalized, according to some aspects of this disclosure, with entities ranging from small molecules (e.g., fluorophores, biotin) to folded proteins (e.g., GFP, antibodies, streptavidin) in a site-specific manner and with yields that surpass those of any reported using phage display technology. A non-limiting example of phage protein modification according to some aspects of this disclosure is the sortase-mediated modification of pVIII, which is difficult to modify with conventional approaches of genetic engineering or chemical labeling. While a phage vector limits the size of an insert into pVIII to a few amino acids, a phagemid system limits the number of copies actually displayed on the surface of M13 phage. Using sortase-based reactions, a 100-fold increase in the efficiency of display of GFP onto pVIII is achieved, as described in more detail elsewhere herein.

[0098] Taking advantage of orthogonal sortases, a plurality of viral capsid proteins can be modified in the same viral particle while maintaining excellent specificity of labeling. The methods provided herein are simple and effective for creating a variety of structures on the surface of viral particles, e.g., of M13 phage capsid proteins.

[0099] The methods, reagents, and kits provided herein can be used to generate complex, virus-templated structures, e.g., branched concatemers, such as lampbrush structures, that can be engineered to carry out novel functions, e.g., structural functions or the harvesting of light. The methods, reagents, and kits provided herein allow for the use of biological structures, e.g., viral particles, as building blocks for the engineering of new materials and structures and for the functionalization of the surface of such structures. The methods, reagents, and kits provided herein can also be used to engineer new functionalities into viral particles, for example, the binding of a new spectrum of cells, the interaction with a specific target protein, e.g., a specific receptor on the surface of a cell of interest, or the delivery of a payload to a specific type of cell expressing a surface molecule of interest. Viral particles can be functionalized using the strategies disclosed herein to attach a cell targeting motif, e.g., a binding agent such as an antibody, nucleic acid, or a bacterial toxin, to the viral surface, in order to increase the uptake/internalization of the functionalized virus by a specific cell or cell type. In some embodiments, the methods and strategies disclosed herein can be used to generate a viral particle that can bind and deliver its genome to a previously uninfectable host cell, resulting in expression of a viral gene product in the host cell. The strategies and methods disclosed herein can also be used to attach a payload, e.g., a functional protein or a small molecule to the surface of a virus that can be delivered upon entry into a target cell.

[0100] The strategies, methods, reagents, and kits disclosed herein can also be used to improve the identification of binding targets in phage display libraries, for example, by using fluorescently labeled phage for the detection of binding events; to generate functionalized viral particles for use as a handle in single molecule force spectroscopy experiments, allowing, for example, to post-translationally attach properly folded complex proteins to the surface of a viral particle; to create complex structures comprising viral particles functionalized with binding agents as building blocks, e.g., using connections between specific viral capsid proteins; to target viral particles to specific cells; and to deliver payloads to target cells upon binding or infection, e.g., toxic agents such as plant or bacterial toxins, antibiotics, and drugs.

Sortase-Mediated Functionalization of Viral Capsid Proteins

[0101] The present invention provides methods, reagents, and kits for the functionalization of viral capsid proteins. Typically, a method of functionalizing a viral capsid protein as provided herein comprises conjugating the target capsid protein with an agent via a sortase-mediated transpeptidation reaction. In order for a sortase-mediated transpeptidation to be possible, both the target protein and the agent must be recognized by the sortase and must be capable of acting as a substrate of the sortase in the transpeptidation reaction. Accordingly, the methods for functionalization of viral capsid proteins provided herein involve viral proteins and agents that comprise or are conjugated to a sortase recognition motif. Some viral proteins and some agents (e.g., proteins) may comprise a suitable sortase recognition motif. However, in some embodiments, the target protein and/or the agent is engineered to comprise a suitable sortase recognition motif, for example, via protein engineering (e.g., using recombinant technologies) or via chemical synthesis (e.g., linking a non-protein agent to a sortase recognition motif).

[0102] Typically, a method for viral capsid protein functionalization as provided herein comprises contacting a target protein, e.g., a viral capsid protein comprising a sortase recognition motif that is accessible on the surface of a viral particle, with an agent comprising a sortase recognition motif, in the presence of a sortase under conditions suitable for the sortase to conjugate the target protein to the agent via a sortase-mediated transpeptidation reaction.

[0103] For example, some embodiments provide methods for modifying a target protein, for example, a target viral capsid protein, comprising a sortase recognition motif on the surface of a virus, that includes contacting the target protein with a sortase substrate conjugated to an agent in the presence of a sortase under conditions suitable for the sortase to ligate the sortase substrate to the target protein. In some embodiments, the target protein comprises an N-terminal sortase recognition motif, and the sortase substrate conjugated to the agent comprises a C-terminal sortase recognition motif. In other embodiments, the target protein comprises a C-terminal sortase recognition motif, and the sortase substrate conjugated to the agent comprises an N-terminal sortase recognition motif. The C- and N-terminal recognition motifs are recognized as substrates by the sortase being employed and ligated in a transpeptidation reaction.

[0104] In a given embodiment, whether a viral target protein comprises (e.g., is engineered to comprise) a C-terminal or an N-terminal sortase recognition motif will depend on the accessibility of the C-terminus and/or the N-terminus of the target protein on the surface of the virus. For example, if the C-terminus of the target protein is accessible on the surface of the virus, e.g., on the surface of the viral capsid, and the N-terminus is not, then a C-terminal sortase recognition motif is suitable and vice versa. For example, in some embodiments, an M13 phage is provided that comprises a pIII protein containing an N-terminal sortase recognition motif, e.g., an N-terminal polyglycine sequence, and is functionalized at the N-terminus by contacting it with a sortase substrate comprising a C-terminal sortase recognition motif, e.g., an LPETG (SEQ ID NO: 10) sequence, conjugated to an agent, e.g., GFP, in the presence of a sortase, e.g., a SrtA, under suitable conditions for the sortase to conjugate pIII and GFP via a sortase-mediated transpeptidation reaction.

[0105] Whether the C-terminus and/or the N-terminus of a given viral target protein is accessible or not on the surface of the respective virus will be apparent to those of skill in the art. Many viruses have been sequenced and the structures of the respective viral capsids have been investigated and can be accessed in publicly available databases, such as ENSEMBL (www.ensembl.org) and NCBI (www.ncbi.nlm.nih.gov). Where structural data is lacking, those of skill in the art will be able to determine the accessibility of the C-terminus and/or the N-terminus of a given viral protein on the surface of the respective viral capsid with no more than routine experimentation.

[0106] In some embodiments, methods are provided that allow for the functionalization, or sortagging, of a plurality of different viral proteins of a virus. For example, in some embodiments, a method is provided that allows for the functionalization of 2, 3, 4, 5, 6, 7, 8, or 9 different viral proteins. In some embodiments, specific functionalization of a plurality of viral capsid proteins involves the use of different sortases, each specifically recognizing a different sortase recognition motif. For example, in some embodiments, a first target protein is functionalized with S. aureus SrtA, recognizing the C-terminal sortase recognition motif LPETGG (SEQ ID NO: 13) and the N-terminal sortase recognition motif (G)n, and a second target protein is functionalized with S. pyogenes SrtA, recognizing the C-terminal sortase recognition motif LPETAA (SEQ ID NO: 12) and the N-terminal sortase recognition motif (A)n. The sortases in this example recognize their respective recognition motif but do not recognize the other sortase recognition motif to a significant extent, and, thus, "specifically" recognize their respective recognition motif. In some embodiments, a sortase binds a sortase recognition motif specifically if it binds the motif with an affinity that is at least 5-fold, 10-fold, 20-fold, 50-fold, 100-fold, 200-fold, 500-fold, 1000-fold, or more than 1000-fold higher than the affinity that the sortase binds a different motif. Such a pairing of orthogonal sortases and their respective recognition motifs, e.g., of the orthogonal sortase A enzymes S. aureus SrtA and S. pyogenes SrtA, can be used to site-specifically conjugate two different moieties onto two different capsid proteins (e.g., a first binding agent to pIII and a second binding agent to pVIII of M13 bacteriophage particles). In some embodiments, sortagging of a plurality of different proteins is achieved by sequentially contacting a virus comprising the different proteins with a first sortase recognizing a sortase recognition motif of a first target protein and a suitable first sortase substrate, and then with a second sortase recognizing a sortase recognition motif of a second target protein and a second suitable sortase substrate, and so forth. Alternatively, the virus may be contacted with a plurality of sortases in parallel, for example, with a first sortase recognizing a sortase recognition motif of a first target protein and a suitable first sortase substrate, and with a second sortase recognizing a sortase recognition motif of a second target protein and a second suitable sortase substrate, and so forth. It will be understood by those of skill in the art that suitable orthogonal sortases preferentially recognize their own motifs over the motifs of other sortases, but that a basal level of recognition of other sortase recognition motifs is not detrimental. For example, S. pyogenes SrtA is able to recognize an LPXTG (SEQ ID NO: 78) motif, but strongly prefers an LPXTA (SEQ ID NO: 91) motif, while S. aureus SrtA shows no cleavage activity for the LPXTA (SEQ ID NO: 91) motif. These two sortases are suitable orthogonal sortases according to some aspects of this invention, as are sortases that exclusively recognize their own sortase recognition sequence.

[0107] For example, in some embodiments, a first viral target protein, e.g., M13 pIII comprising an N-terminal poly-G sequence, is functionalized using sortase A from Staphylococcus aureus (S. aureus SrtA), and a second target protein, e.g., M13 pVIII comprising an N-terminal poly-A sequence, is functionalized using sortase A from Streptococcus pyogenes (S. pyogenes SrtA). In some such embodiments, the virus, e.g., the M13 phage, may be contacted first with S. aureus SrtA (and a suitable substrate) and subsequently with S. pyogenes SrtA (and a suitable substrate), or, since the two sortases are orthogonal sortases, the respective virus may be contacted with both sortases and both substrates at the same time.
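
The pairing of orthogonal sortases with their recognition motifs described above can be illustrated with a simple lookup. This is a sketch under stated assumptions, not a method taken from the disclosure: the patterns encode only the LPXTG and LPXTA consensus motifs and the poly-G/poly-A N-termini named in the text, the function names and the `min_repeat` threshold are hypothetical, and real recognition depends on sequence context and surface accessibility that a pattern match cannot capture.

```python
import re

# X is restricted here to the 20 standard amino acids (an assumption).
AMINO = "ACDEFGHIKLMNPQRSTVWY"
C_TERMINAL_MOTIFS = {
    "LPXTG": re.compile(f"LP[{AMINO}]TG"),
    "LPXTA": re.compile(f"LP[{AMINO}]TA"),
}

# Recognition as stated in the text: the S. pyogenes enzyme accepts LPXTG
# but strongly prefers LPXTA; the S. aureus enzyme does not cleave LPXTA.
RECOGNITION = {
    "S. aureus SrtA": {"LPXTG": "recognized"},
    "S. pyogenes SrtA": {"LPXTA": "preferred", "LPXTG": "recognized"},
}


def scan_c_terminal_motifs(protein):
    """List (position, matched text, consensus name, sortases) for each hit."""
    hits = []
    for name, pattern in C_TERMINAL_MOTIFS.items():
        for match in pattern.finditer(protein):
            sortases = sorted(s for s, motifs in RECOGNITION.items() if name in motifs)
            hits.append((match.start(), match.group(), name, sortases))
    return sorted(hits)


def n_terminal_nucleophile(protein, min_repeat=2):
    """Classify an N-terminus as 'poly-G', 'poly-A', or None.

    min_repeat is a hypothetical threshold chosen for illustration only.
    """
    match = re.match(r"G+|A+", protein)
    if match is None or len(match.group()) < min_repeat:
        return None
    return "poly-G" if match.group()[0] == "G" else "poly-A"


# Hypothetical substrate sequences, for illustration only.
print(scan_c_terminal_motifs("MKTLPETGGHHHHHH"))
print(scan_c_terminal_motifs("GFPLPETAA"))
print(n_terminal_nucleophile("GGGGGSAE"), n_terminal_nucleophile("AADPAK"))
```

In this sketch a substrate ending in LPETAA is matched only by the S. pyogenes entry, which mirrors the simultaneous labeling of poly-G pIII and poly-A pVIII described in the text.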

[0108] Any sortases that recognize sufficiently different sortase recognition motifs with sufficient specificity are suitable for sortagging of a plurality of viral proteins of the same virus. The respective sortase recognition motifs can be inserted into the target proteins using recombinant technologies known to those of skill in the art. In some embodiments, suitable sortase recognition motifs may be present in a wild type target protein, for example, an N-terminal polyglycine or polyalanine sequence, in which case no further engineering of the target protein may be required. The skilled artisan will understand that the choice of a suitable sortase for the functionalization of a given target protein may depend on the sequence of the target protein, e.g., on whether or not the target protein comprises a sequence at its C-terminus or its N-terminus that can be recognized as a substrate by any known sortase. In some embodiments, use of a sortase that recognizes a naturally-occurring C-terminal or N-terminal recognition motif is preferred since further engineering of the target protein can be avoided.

[0109] In some embodiments, a plurality of different target proteins is functionalized on the surface of the same viral particle. In some embodiments, the different target proteins are functionalized with different agents. For example, in some embodiments, a first target protein may be functionalized with a first binding agent, and a second target protein may be functionalized with a second binding agent. One example of such an embodiment is the functionalization of M13 pIII with biotin and the functionalization of M13 pVIII with streptavidin on the surface of the same M13 phage particle. Another example of such an embodiment is the functionalization of M13 pIII with a nucleic acid molecule, e.g., an oligonucleotide, and the functionalization of M13 pVIII with a different nucleic acid molecule, e.g., a different oligonucleotide. For another example, in some embodiments, a first target protein is functionalized with a binding agent, and a second target protein is functionalized with a detectable label. In some embodiments, a first target protein is functionalized with a binding agent, a second target protein is functionalized with a detectable label, and a third target protein is functionalized with a click chemistry handle. Additional embodiments in which a plurality of different target proteins is sortagged with a plurality of different agents are provided herein, and further embodiments will be apparent to those of skill in the art based on the present disclosure. It will be understood that the invention is not limited in the number of different target proteins to be functionalized nor the number of different agents to be conjugated to the target proteins.

[0110] In some embodiments, an engineered viral capsid protein provided herein comprises a sortase recognition motif, e.g., a C-terminal or an N-terminal sortase recognition motif, within a loop structure. In some embodiments, the loop structure is formed by disulfide bonds between two cysteine residues flanking the sortase recognition motif. In some embodiments, the loop structure is situated at the N-terminus or the C-terminus of the engineered viral capsid protein, or inserted into the sequence of the viral capsid protein near the N- or the C-terminus (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, less than 15, less than 20, or less than 25 amino acid residues away from the N- or C-terminus of the viral capsid protein). In some embodiments, the loop structure comprises a cleavable site or a cleavable bond, the cleavage of which opens the loop. In some embodiments, the cleavable bond is a photocleavable bond. In some embodiments, the cleavable bond is a peptide bond, e.g., a peptide bond situated in a protease cleavage site comprised in the loop structure. In some embodiments, the loop structure comprises a protease cleavage site situated between the cysteine residues forming the loop and is, thus, sensitive to cleavage by the protease. In some embodiments, cleavage of the engineered viral capsid protein by the protease opens the loop structure. In some embodiments, the loop structure comprises an N-terminal cysteine, a sortase recognition motif situated C-terminally of the N-terminal cysteine, a protease cleavage site situated C-terminally of the sortase recognition motif, and a C-terminal cysteine. In some embodiments, the loop structure comprises an N-terminal cysteine, a protease cleavage site situated C-terminally of the N-terminal cysteine, a sortase recognition motif situated C-terminally of the protease cleavage site, and a C-terminal cysteine. In some embodiments, an amino acid residue, sequence, or structure comprised in the loop structure (e.g., the N-terminal cysteine, sortase recognition motif, protease cleavage site, and C-terminal cysteine) may be conjugated to another residue, sequence or structure of the loop via a linker, e.g., an amino acid or peptide linker. In some embodiments, the linker is a cleavable linker. In some embodiments, the linker is 3, 4, 5, 6, 7, 8, 9, or 10 amino acid residues long. In some embodiments, the linker comprises more than 10 amino acids. Suitable protease cleavage sites (and corresponding proteases cleaving such sites) are described herein. Exemplary suitable cleavage sites and corresponding proteases include, e.g., thrombin, TEV protease, Factor Xa, PreScission protease, and papain cleavage sites. Additional suitable proteases and cleavage sites will be apparent to the skilled artisan, and such suitable proteases and cleavage sites include, without limitation, those reported in the passage from paragraph [0093] to paragraph [0097], and in Table 2 and the Table following paragraph [0097] of U.S. patent application Ser. No. 13/642,458, publication number US2013/0122043, by Guimaraes and Ploegh, the entire contents of which passage and tables are incorporated herein by reference. In some embodiments, the loop structure comprises a bacterial toxin sequence, e.g., a sequence of a bacterial protein that comprises a loop structure. Exemplary suitable bacterial toxin sequences are described herein, and additional suitable sequences will be apparent to those of skill in the art based on the instant disclosure. Such suitable sequences include, without limitation, those reported in the passage from paragraph [0044] to paragraph [0080] and in paragraph [0175] of U.S. patent application Ser. No. 13/642,458, publication number US2013/0122043, by Guimaraes and Ploegh, the entire contents of which passage and paragraph are incorporated herein by reference. Exemplary suitable loop structures that are useful for engineering viral capsid proteins are disclosed herein, and additional suitable loop structures will be apparent to those of skill in the art. Such additional loop structures include, for example, those reported in U.S. patent application, U.S. Ser. No. 13/642,458, publication number US2013/0122043, by Guimaraes and Ploegh, the entire contents of which are incorporated herein by reference.

[0111] Sortases, sortase-mediated transacylation reactions, and their use in transpeptidation (sometimes also referred to as transacylation) for protein engineering are well known to those of skill in the art (see, e.g., Ploegh et al., International PCT Patent Application, PCT/US2010/000274, filed Feb. 1, 2010, published as WO 2010/087994 on Aug. 5, 2010, and Ploegh et al., International PCT Patent Application PCT/US2011/033303, filed Apr. 20, 2011, published as WO 2011/133704 on Oct. 27, 2011, the entire contents of which are incorporated herein by reference). In general, the transpeptidation reaction catalyzed by sortase results in the conjugation of a protein containing a C-terminal sortase recognition motif, e.g., LPXTX (wherein each occurrence of X independently represents any amino acid residue), with a peptide comprising an N-terminal sortase recognition motif, e.g., one or more N-terminal glycine residues. In some embodiments, the sortase recognition motif is a sortase recognition motif described herein. In certain embodiments, the sortase recognition motif is an LPXT motif or LPXTG (SEQ ID NO: 78).

[0112] The sortase transacylation reaction provides means for efficiently linking an acyl donor with a nucleophilic acyl acceptor. This principle is widely applicable to many acyl donors and a multitude of different acyl acceptors. Previously, the sortase reaction was employed for ligating proteins and/or peptides to one another, ligating synthetic peptides to recombinant proteins, linking a reporting molecule to a protein or peptide, joining a nucleic acid to a protein or peptide, conjugating a protein or peptide to a solid support or polymer, and linking a protein or peptide to a label. Such products and processes save cost and time associated with ligation product synthesis and are useful for conveniently linking an acyl donor to an acyl acceptor. However, the modification and functionalization of proteins on the surface of viral particles via sortagging, as provided herein, has not been described previously.

[0113] Sortase-mediated transpeptidation reactions (also sometimes referred to as transacylation reactions) are catalyzed by the transamidase activity of sortase, which forms a peptide linkage (an amide linkage) between an acyl donor compound and a nucleophilic acyl acceptor containing an NH2-CH2- moiety. In some embodiments, the sortase employed to carry out a sortase-mediated transpeptidation reaction is sortase A (SrtA). However, it should be noted that any sortase, or transamidase, catalyzing a transacylation reaction can be used in some embodiments of this invention, as the invention is not limited to the use of sortase A.

[0114] In certain embodiments, a sortase-mediated transpeptidation reaction for C-terminal functionalization of a viral surface protein, for example, of an M13 capsid protein, is provided that comprises a step of contacting a virus comprising a surface protein comprising a C-terminal sortase recognition sequence of the structure:

PRT-[sortase recognition motif]-C(=O)-XR1

wherein

[0115] PRT is a viral capsid protein;

[0116] the sortase recognition motif is a C-terminal sortase recognition motif, e.g., an LP(Xaa)T motif, wherein Xaa represents any amino acid residue;

[0117] X is -O-, -NR-, or -S-; wherein R is hydrogen, substituted or unsubstituted aliphatic, or substituted or unsubstituted heteroaliphatic;

[0118] R1 is H, acyl, substituted or unsubstituted aliphatic, substituted or unsubstituted heteroaliphatic, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;

[0119] with a nucleophilic moiety conjugated to an agent, according to the formula:

H2N-CH2-C(=O)-[sortase recognition motif]-Agent

wherein

[0120] the sortase recognition motif is an N-terminal sortase recognition motif, for example, a polyglycine (G)n or polyalanine (A)n motif (wherein n is an integer between 0-100 inclusive);

[0121] the agent is acyl, substituted or unsubstituted aliphatic, substituted or unsubstituted heteroaliphatic, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, an amino acid, a peptide, a protein, a polynucleotide, a carbohydrate, a tag, a metal atom, a contrast agent, a catalyst, a non-polypeptide polymer, a synthetic polymer, a recognition element, a small molecule, a lipid, a linker, or a label; and

[0122] the nucleophilic compound comprises, optionally, a linker connecting the agent to the nucleophilic amine group;

[0123] in the presence of a sortase, under conditions suitable to form a functionalized viral surface protein of formula:

[chemical structure not reproduced]

[0124] In certain embodiments, a sortase-mediated transpeptidation reaction for N-terminal functionalization of a viral surface protein, for example, of an M13 capsid protein, is provided that comprises a step of contacting a virus comprising a surface protein comprising an N-terminal sortase recognition sequence of the structure:

H2N-CH2-C(=O)-[sortase recognition motif]-PRT

wherein

[0125] PRT is a viral capsid protein;

[0126] the sortase recognition motif is an N-terminal sortase recognition motif, for example, a polyglycine (G)n or polyalanine (A)n motif (wherein n is an integer between 0-100 inclusive);

with an agent conjugated to a C-terminal sortase recognition motif, of the formula:

Agent-[sortase recognition motif]-C(=O)-XR1

wherein

[0127] the agent is acyl, substituted or unsubstituted aliphatic, substituted or unsubstituted heteroaliphatic, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, an amino acid, a peptide, a protein, a polynucleotide, a carbohydrate, a tag, a metal atom, a contrast agent, a catalyst, a non-polypeptide polymer, a synthetic polymer, a recognition element, a small molecule, a lipid, a linker, or a label;

[0128] optionally, wherein the agent is connected to the nucleophilic amine group via a linker;

[0129] the sortase recognition motif is a C-terminal sortase recognition motif, e.g., an LP(Xaa)T motif, wherein Xaa represents any amino acid residue;

[0130] X is -O-, -NR-, or -S-; wherein R is hydrogen, substituted or unsubstituted aliphatic, or substituted or unsubstituted heteroaliphatic; and

[0131] R1 is H, acyl, substituted or unsubstituted aliphatic, substituted or unsubstituted heteroaliphatic, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;

[0132] in the presence of a sortase, under conditions suitable to form a functionalized viral surface protein of formula:

[chemical structure not reproduced]

[0133] In some embodiments, the C-terminal sortase recognition motif is LPXT, wherein X is a standard or non-standard amino acid. In some embodiments, X is selected from D, E, A, N, Q, K, or R. In some embodiments, the recognition sequence is selected from LPXT, SPXT, LAXT, LSXT, NPXT, VPXT, IPXT, and YPXR. In some embodiments, X is selected to match a naturally occurring transamidase recognition sequence. In some embodiments, the transamidase recognition sequence is selected from LPKT (SEQ ID NO: 93), LPIT (SEQ ID NO: 94), LPDT (SEQ ID NO: 95), SPKT (SEQ ID NO: 96), LAET (SEQ ID NO: 97), LAAT (SEQ ID NO: 98), LAET (SEQ ID NO: 99), LAST (SEQ ID NO: 100), LAET (SEQ ID NO: 101), LPLT (SEQ ID NO: 102), LSRT (SEQ ID NO: 103), LPET (SEQ ID NO: 104), VPDT (SEQ ID NO: 105), IPQT (SEQ ID NO: 106), YPRR (SEQ ID NO: 107), LPMT (SEQ ID NO: 108), LPLT (SEQ ID NO: 109), LAFT (SEQ ID NO: 110), LPQT (SEQ ID NO: 111), NSKT (SEQ ID NO: 112), NPQT (SEQ ID NO: 113), NAKT (SEQ ID NO: 114), and NPQS (SEQ ID NO: 115). In some embodiments, e.g., in certain embodiments in which sortase A is used, the transamidase recognition motif comprises the amino acid sequence X1PX2X3, where X1 is leucine, isoleucine, valine, or methionine; X2 is any amino acid; X3 is threonine, serine, or alanine; P is proline and G is glycine. In specific embodiments, as noted above, X1 is leucine and X3 is threonine. In certain embodiments, X2 is aspartate, glutamate, alanine, glutamine, lysine, or methionine. In certain embodiments, e.g., where sortase B is utilized, the recognition sequence often comprises the amino acid sequence NPX1TX2, where X1 is glutamine or lysine; X2 is asparagine or glycine; N is asparagine; P is proline, and T is threonine. The invention encompasses the recognition that selection of X may be based at least in part in order to confer desired properties on the compound containing the recognition motif. In some embodiments, X is selected to modify a property of the compound that contains the recognition motif, such as to increase or decrease solubility in a particular solvent. In some embodiments, X is selected to be compatible with reaction conditions to be used in synthesizing a compound comprising the recognition motif, e.g., to be unreactive towards reactants used in the synthesis. One of ordinary skill will appreciate that, in certain embodiments, the C-terminal amino acid of the C-terminal sortase recognition motif may be omitted. For example, an acyl group, e.g., of formula [chemical structure not reproduced] may replace the C-terminal amino acid of the sortase recognition motif. In some embodiments, the acyl group is [chemical structure not reproduced; bears an OR1 group]. In certain embodiments, R1 is substituted aliphatic. In certain embodiments, R1 is unsubstituted aliphatic. In some embodiments, R1 is substituted C1-12 aliphatic. In some embodiments, R1 is unsubstituted C1-12 aliphatic. In some embodiments, R1 is substituted C1-6 aliphatic. In some embodiments, R1 is unsubstituted C1-6 aliphatic. In some embodiments, R1 is C1-3 aliphatic. In some embodiments, R1 is butyl. In some embodiments, R1 is n-butyl. In some embodiments, R1 is isobutyl. In some embodiments, R1 is propyl. In some embodiments, R1 is n-propyl. In some embodiments, R1 is isopropyl. In some embodiments, R1 is ethyl. In some embodiments, R1 is methyl. In certain embodiments, R1 is substituted aryl. In certain embodiments, R1 is unsubstituted aryl. In certain embodiments, R1 is substituted phenyl. In certain embodiments, R1 is unsubstituted phenyl. In some embodiments, the acyl group is [chemical structure not reproduced; bears an OMe group].

[0134] In some embodiments, the agent to be conjugated to the target protein comprises a protein. In some embodiments, the agent comprises a peptide. In some embodiments, the agent comprises a binding agent. In some embodiments, the agent comprises biotin. In some embodiments, the agent comprises streptavidin. In some embodiments, the agent comprises an antibody, an antibody chain, an antibody fragment, an antibody epitope, an antigen-binding antibody domain, a VHH domain, a single-domain antibody, a camelid antibody, a nanobody, or an adnectin. In some embodiments, the agent comprises a recombinant protein, a protein comprising one or more D-amino acids, a branched peptide, a therapeutic protein, an enzyme, a polypeptide subunit of a multisubunit protein, a transmembrane protein, a cell surface protein, a methylated peptide or protein, an acylated peptide or protein, a lipidated peptide or protein, a phosphorylated peptide or protein, or a glycosylated peptide or protein. In some embodiments, the agent is an amino acid sequence comprising at least 3 amino acids. In some embodiments, the agent comprises a fluorophore, a chromophore, or a fluorescent or phosphorescent moiety, or a radiolabel. In some embodiments, the agent comprises green fluorescent protein. In some embodiments, the agent comprises ubiquitin. In

some embodiments, the agent comprises a small molecule. In some embodiments, the agent comprises a drug.

[0135] In certain embodiments, n (designating the number of amino acids in the N-terminal sortase recognition motif) is an integer from 0 to 50, inclusive. In certain embodiments, n is an integer from 0 to 20, inclusive. In certain embodiments, n is 0. In certain embodiments, n is 1. In certain embodiments, n is 2. In certain embodiments, n is 3. In certain embodiments, n is 4. In certain embodiments, n is 5. In certain embodiments, n is 6.

[0136] Any sortase that can carry out a transpeptidation reaction under conditions suitable for maintaining structural and functional integrity of the viral particle and the viral capsid protein to be modified can be used in this invention. Examples of suitable sortases include, but are not limited to, sortase A and sortase B, for example, from Staphylococcus aureus or Streptococcus pyogenes. Additional sortases suitable for use in this invention will be apparent to those of skill in the art, including, but not limited to, any of the 61 sortases described in Dramsi S, Trieu-Cuot P, Bierne H, Sorting sortases: a nomenclature proposal for the various sortases of Gram-positive bacteria. Res Microbiol. 156(3):289-97, 2005, the entire contents of which are incorporated herein by reference. Sortases belonging to any class of sortases, e.g., class A, class B, class C, and class D sortases, and sortases belonging to any sub-family of sortases (subfamily 1, subfamily 2, subfamily 3, subfamily 4 and subfamily 5) can be used in this invention.

[0137] Any amino acid sequence recognized by a sortase can be used in the present invention. It will be understood by those of skill in the art, however, that in order for a certain sortase to carry out a transpeptidation reaction, the sortase recognition motif of the target protein to be modified and the sortase recognition motif the agent is conjugated to need to be recognized by that sortase. Numerous suitable sortase recognition motifs are provided herein, and additional suitable sortase recognition motifs will be apparent to the skilled artisan. Aside from naturally occurring sortase recognition motifs, some embodiments of this invention contemplate the use of non-naturally occurring sortase recognition motifs and sortases recognizing such motifs, for example, sortase motifs and sortases described in Piotukh et al., Directed evolution of sortase A mutants with altered substrate selectivity profiles. J Am Chem Soc. 2011 Nov. 9; 133(44):17536-9; and Chen I, Dorr BM, and Liu DR. A general strategy for the evolution of bond-forming enzymes using yeast display. Proc Natl Acad Sci USA. 2011 Jul. 12; 108(28):11399-404; the entire contents of each of which are incorporated herein by reference. In some embodiments, a recognition sequence, e.g., a sortase recognition sequence as provided herein, further comprises one or more additional amino acids, e.g., at the N and/or C terminus. For example, one or more amino acids (e.g., up to 5 amino acids) having the identity of amino acids found immediately N-terminal to, or C-terminal to, a five amino acid recognition sequence in a naturally occurring sortase substrate may be incorporated. Such additional amino acids may provide context that improves the recognition of the recognition motif.

Functionalization of M13 Phage Particles

[0138] The methods for functionalization of viral proteins via sortase-mediated transpeptidation provided herein can be used to modify surface proteins on any virus. As described in the Examples section herein, the method has been demonstrated to be capable of efficiently modifying surface proteins of the bacteriophage M13. However, it will be apparent to those of skill in the art that the methods, reagents, and kits provided herein can be used to modify and functionalize surface proteins on other viruses as well.

[0139] Wild type M13 bacteriophage has a cylindrical shape with a length of about 880 nm and a diameter of about 6 nm. It encapsulates a single-strand genome that encodes five different capsid proteins (FIG. 1A). The body of the phage is composed of 2700 copies of pVIII, the major capsid protein. At one end of the virus, there are ~5 copies of both pIII and pVI proteins, and at the other end there are ~5 copies of both pVII and pIX proteins.

[0140] The capsid proteins of M13 bacteriophage have been used to express combinatorial peptide libraries or protein variants (ranging from single domains to antibodies) to screen for target ligands in a process known as phage display. This technique has enabled not only identification of peptides with affinity for biological targets such as proteins, cells, and tissues, but also allowed the identification of biomolecules that bind inorganics. These molecules, when expressed on the M13 capsid proteins, can serve as scaffolds for nanowires, structures, and devices. Functionalization of a virion capsid such as M13 is currently accomplished using chemical and/or genetic approaches. However, both strategies have limitations. Chemical conjugations are convenient and versatile, but they label motifs found on multiple M13 capsid proteins and oftentimes require non-physiological pH and reducing conditions that compromise the activity of the molecule that is being attached or of the moieties already displayed on other capsid proteins.

[0141] Genetic engineering of phage allows the encoded protein/peptide to be displayed precisely, but it has intrinsic restrictions. Two classes of vectors are available for genetic phage display: phagemid and phage. A phagemid allows expression of large fusions with any of the five M13 phage capsid proteins, but these fusions are incorporated at low efficiency. In a phage vector, the M13 bacteriophage genome is modified directly. As a result, every copy of the recombinant capsid protein incorporated into the virus displays the modified protein. However, this strategy does not support display of large moieties. pVIII allows the display of a larger number of recombinant molecules per phage particle, but it also has the strictest size limitation in phage vector display. pVIII peptide libraries are mostly limited to sizes of up to 10 amino acids, as phage with longer insertions rarely assemble. Insertions of 6-20 amino acids onto pVIII are possible using phagemid, but their display is inefficient, with less than 25% of the copies of pVIII containing the desired fusion product. Incorporation of proteins is even less efficient on pVIII: a 23 kDa protein is displayed, on average, on less than a single copy of the pVIII fusion per phage particle using a phagemid vector. Phage display methods on the pVIII have been able to increase the binding affinity of phage displaying a moiety, but the displayed copy number of the moiety has not been determined. Large moieties of at least 23 kDa have been genetically fused to all four minor capsid proteins using a phagemid vector, but only pIII has been extensively used in the phage vector system. However, viability of the resultant phage fusions does not guarantee that the recombinant peptide/protein of interest displays its native structure and/or maintains its wild type function. Both the environment where phage assembles and the phage coat protein to which the protein of interest is fused may interfere with proper folding. This is particularly critical for enzymes and antibodies as they might not be functional when incorporated into the phage structure.

[0142] The technology provided by this disclosure expands the versatility of M13 as a display platform, by employing a strategy based on sortase-mediated chemo-enzymatic reactions to covalently attach a variety of moieties to the N-terminus of pIII, pVIII, and pIX. The technology provided herein allows for the conjugation of functional moieties and molecules at a high efficiency, as illustrated by a comparison to published labeling data described in more detail in the Examples section. For example, as described in more detail in the Examples section, the instantly described sortase-based functionalization technology represents a significant improvement over current methodologies in the copy number of displayed peptides and proteins, particularly on pVIII.

[0143] Sortase A enzymes allow modification of proteins by enzymatic ligation with a wide range of molecules, moieties, and functional groups (including biotin, fluorophores, and other proteins) at the C-terminus, N-terminus, or at both termini of the protein of interest (see, e.g., Ploegh et al., International PCT Patent Application, PCT/US2010/000274, filed Feb. 1, 2010, published as WO/2010/087994 on Aug. 5, 2010, and Ploegh et al., International Patent Application PCT/US2011/033303, filed Apr. 20, 2011, published as WO/2011/133704 on Oct. 27, 2011, the entire contents of which are incorporated herein by reference). Different sortase enzymes are known to those of skill in the art, and any sortase carrying out a transpeptidation reaction can be used in the context of the instant disclosure. For example, the widely used sortase A from Staphylococcus aureus (SrtA_aureus) recognizes substrates that contain an LPXTG (SEQ ID NO: 78) sequence³⁸, whereas sortase A from Streptococcus pyogenes (SrtA_pyogenes) recognizes substrates with an LPXTA (SEQ ID NO: 91) motif. The sortase enzymes cleave between the threonine and glycine or alanine residue, respectively, to yield a covalent acyl-enzyme intermediate that is resolved by nucleophilic attack of a suitably exposed amine, namely oligoglycine or oligoalanine-containing peptides in the case of SrtA_aureus or SrtA_pyogenes, respectively (FIG. 1B). Some aspects of this invention provide methods and protocols using a plurality of orthogonal sortase A enzymes, e.g., SrtA_aureus and SrtA_pyogenes, to site-specifically conjugate two different moieties onto two different capsid proteins (e.g., pIII and pVIII) in a single phage particle.

[0144] The sortase labeling methods provided herein have several advantages over genetic and chemical methods. First, the sortase transpeptidation reaction is site-specific. This is advantageous, as it allows one to specifically target sortase activity towards a genetically engineered target protein. For example, in the case of sortagging of an M13 capsid protein, as none of the M13 coat proteins naturally display a sortase recognition motif required to participate in sortase-mediated reactions, a capsid protein engineered to comprise such a motif will be specifically targeted by a sortase, while the non-engineered proteins will not participate in the sortase reaction. Second, sortase recognition motifs are small and, therefore, can be easily inserted into the host genome, e.g., the M13 phage genome, thus maximizing the number of potential attachment sites. Third, a protein to be conjugated to a cell surface or particle surface protein by means of sortase, e.g., a protein to be displayed on a phage particle, can be properly folded separate from the conjugation reaction, and, as the case may be, separate from the assembly of phage particles. The site-specific nature of the reaction fixes the orientation of the displayed protein. Fourth, the reactions are performed under physiological conditions. Fifth, sortase reactions afford attachment of a wide range of molecules, including those that cannot be genetically encoded, such as fluorophores and biotin.

[0145] Some aspects of this description provide reagents and methods to build phage structures that have new material and biological applications. Some non-limiting examples are described in detail: the creation of a new lampbrush structure by fusing different phage particles through pIII/pVIII, a fluorescently labeled phage containing a cell-targeting moiety to stain and to sort cells by FACS, and the formation of multiphage particles of a specific, predetermined structure via hybridization-mediated linkage of DNA oligonucleotides conjugated to pIII/pVIII of phage particles. It will be apparent to the skilled artisan that the described examples are illustrative and non-limiting, as various additional applications of the technology described herein will be apparent to the skilled artisan.

[0146] In some embodiments, the ability to fluorescently stain cells can be used in the panning of phage display libraries against specific cells. Phage particles functionalized with fluorescent moieties or proteins allow for more sensitive detection of binding events and/or for decreasing the number of panning rounds needed for identifying a biomolecule of interest in phage display screens.

[0147] The ability to generate structures using functionalized phage as building blocks can be used to produce complex hybrid material structures. For example, in some embodiments, functionalized phage particles can be created that can bind to and nucleate different materials, including other phage particles, organic materials, and inorganic materials. In some embodiments, hybrid structures of inorganic matter and phage particles can be generated.

[0148] Some aspects of this invention provide methods for associating viral particles, for example, M13 phage particles, with viral particles of the same type (e.g., with other M13 phage particles), with viral particles of a different type (e.g., with phage particles of a different strain), or with cells or other entities (e.g., with target cells, e.g., bacterial cells not typically bound or infected by wild-type M13 phage, or with non-target cells, e.g., yeast, insect, or mammalian cells, or with organic particles, e.g., nanoparticles).

[0149] Typically, a method for associating viral particles of the same type comprises conjugating a first target protein on the surface of the viral particle with a first binding agent via sortase-mediated transpeptidation; conjugating a second target protein on the surface of the viral particle with a second binding agent, wherein the second binding agent binds the first binding agent; and incubating a plurality of viral particles comprising the first and the second binding agent under conditions suitable for the first and the second binding agent of different viral particles to bind each other. In some embodiments, the first binding agent is a ligand-binding agent, for example, a receptor, or a receptor fragment, and the second binding agent comprises the ligand bound by the ligand-binding agent. For example, in some embodiments, the first binding agent is biotin, and the second binding agent is streptavidin. In some embodiments, the first binding agent comprises an antibody or an antigen-binding antibody fragment, and the second binding agent comprises the antigen bound by the antibody or antibody fragment. In some embodiments, an M13 capsid protein is sortagged with a first binding agent, e.g., pIII with biotin or a first oligonucleotide, and a second M13 capsid protein is sortagged with a second binding agent binding the first binding agent, e.g., pVIII with streptavidin or a second oligonucleotide. As described in more detail elsewhere herein, the M13 particles functionalized in this manner associate when incubated under suitable conditions, e.g., under suitable conditions for biotin and streptavidin to bind or under suitable conditions for the first and second oligonucleotide to become associated with each other (e.g., via hybridization to a third oligonucleotide), and can form complex, branched structures not observed in non-functionalized phage particles.

[0150] A method for associating viral particles of one type to viral particles of a different type typically comprises conjugating a target protein on the surface of a first viral particle with a first binding agent via sortase-mediated transpeptidation reaction; conjugating a target protein on the surface of a second viral particle with a second binding agent, wherein the second binding agent binds the first binding agent directly or can otherwise become associated with the first binding agent (e.g., by binding a molecule bound by the first binding agent); and contacting and incubating a plurality of viral particles comprising the first binding agent with a plurality of viral particles comprising the second binding agent under conditions suitable for the first and the second binding agent of different viral particles to bind each other. In some embodiments, the first binding agent is a ligand-binding agent, for example, a receptor, or a receptor fragment, or an adhesion molecule, and the second binding agent comprises the ligand bound by the ligand-binding agent. For example, in some embodiments, the first binding agent is biotin and the second binding agent is streptavidin. In some embodiments, the first binding agent comprises an antibody or an antigen-binding antibody fragment, and the second binding agent comprises the antigen bound by the antibody or antibody fragment. In some embodiments, an M13 capsid protein of a first M13 particle is sortagged with a first binding agent, e.g., pIII with biotin, and a second M13 capsid protein of a second M13 particle is sortagged with a second binding agent binding the first binding agent, e.g., pVIII with streptavidin. In other embodiments, the same capsid protein is sortagged with a first binding agent on a first M13 particle and with a second binding agent on a second M13 particle, e.g., pVIII is sortagged with biotin on a first M13 particle and with streptavidin on a second M13 particle. The M13 particles functionalized in this manner are then incubated under conditions suitable for them to associate, resulting in a branched structure of associated, differently sortagged M13 particles.

particles is contacted with a plurality of cells under suitable conditions. The association of viral particles with other viral particles of a different type, or with cells, e.g., with cells that are not naturally bound or infected by the viral particles, allows for the generation of novel hybrid structures and materials, the characteristics of which will be determined by the structure of the associated entities, and by the agents and target proteins used for functionalization of the viral particles.

Functionalized Viral Particles

[0152] Some aspects of this invention provide functionalized viral particles, in which at least one viral capsid protein has been sortagged according to methods, or using reagents or strategies provided herein. In some embodiments, the functionalized virus comprises a target protein, for example, a viral capsid protein, that is conjugated to an agent via a sortase recognition motif as described herein. In some embodiments, the agent is conjugated to the target protein via a linker. In some embodiments, the linker is a peptide linker, e.g., a linker comprising a sequence of amino acids. In some embodiments, the linker is a cleavable linker, for example, a linker comprising a protease cleavage site, or a photocleavable linker. Cleavable linkers, including, but not limited to, linkers comprising protease cleavage sites and photocleavable linkers, are well known to those of skill in the art, and the invention is not limited in this respect. In some embodiments, the agent has been conjugated to the target protein by a sortase-mediated transpeptidation reaction, e.g., by a method provided herein. Typically, a sortase-mediated transpeptidation reaction leaves a "scar" in the generated protein, which comprises the C-terminal sortase recognition motif (e.g., LPXT, or any other C-terminal sortase recognition motif described herein) and, in some embodiments, a plurality of N-terminal amino acids comprised in the respective N-terminal sortase recognition motif, e.g., (G)n or (A)n, wherein n is an integer equal to or greater than 2. The sortase recognition motif in the product of the transpeptidation reaction is typically a sequence created by the sortase reaction, e.g., by a SrtA_aureus-mediated transpeptidation reaction or by a SrtA_pyogenes-mediated transpeptidation reaction.

[0153] In some embodiments, the agent conjugated to the capsid protein is a protein, a detectable label, a binding agent, a click-chemistry handle, a small molecule, or any other agent described herein. In some embodiments, the virus comprises a plurality of different target proteins conjugated to an agent (e.g., different types of target proteins to different agents) via a sortase recognition motif.
In some embodiments, different 0151 Viral particles can be functionalized with any suit target proteins of the virus are conjugated to different agents, able binding agent, for example, with a binding agent binding for example, a binding agent and a detectable label; two an antigen or ligand on the Surface of a cell, e.g., a bacterial different detectable labels; a first binding agent, a second cell, a yeast cell, an insect cell, a vertebrate cell, or a mam binding agent, and a detectable label, and so on. In some malian cell. Incubation of the functionalized viral particle embodiments, the different target proteins are conjugated to with the cell results in binding of the functionalized viral the respective agents via Sortase recognition motifs of particle to the cell. In some embodiments, the binding agent is orthogonal sortases. For example, in some embodiments, a biotin/streptavidin. Other suitable binding agents include, virus is provided comprising a first target protein conjugated without limitation, complementary DNA strands, ligands of to a first agent via a SrtA recognition motif, and a second receptors expressed on the Surface of the target cells, and target protein conjugated to a secondagent via a SrtA leucine Zippers. In some embodiments, direct attachment of recognition motif. phage to a cell or other biological structure is effected by 0154. In some embodiments, a functionalized M13 bacte placing a Sortase Substrate on the Surface of the phage, and a riophage is provided that comprises a pII conjugated to an compatible sortase substrate on the surface of the cell or agent via a sortase recognition motif. In some embodiments, biological structure and then effecting a Sortase-mediated a functionalized M13 bacteriophage is provided that com transpeptidation reaction between the two. Association of prises apVIII conjugated to an agent via a sortase recognition viral particles and cells can be achieved if a plurality of motif. 
In some embodiments, a functionalized M13 bacte US 2014/003 0697 A1 Jan. 30, 2014 30 riophage is provided that comprises a pX conjugated to an teriophage belonging to the family of Myoviridae (e.g., T4 agent via a sortase recognition motif. In some embodiments, phage), Siphoviridae (e.g., W. phage, Bacteriophage T5), the agent is an agent as described herein, for example, a Podoviridae (e.g., T7 phage), Ligamenvirales, Lipothrixviri binding agent or a detectable label. In some embodiments, a functionalized M13 bacteriophage is provided that comprises dae, Rudiviridae, , Bacilloviridae, Bicau a pII conjugated to a first agent, and a pVIII conjugated to a daviridae, Clavaviridae, Corticoviridae, Cystoviridae, Fusell second, different agent. In some embodiments, a functional oviridae, Globuloviridae, Guttavirus, Inoviridae, Leviviridae ized M13 bacteriophage is provided that comprises a plII (e.g., MS2, QB), Microviridae (e.g., dX174), Plasmaviridae, conjugated to a first agent, and a pX conjugated to a second, or Tectiviridae. Exemplary functionalized bacteriophages different agent. In some embodiments, a functionalized M13 provided herein include, without limitation, Lambda phage bacteriophage is provided that comprises a pVIII conjugated (w phage, lysogen), T2 phage, T4 phage, T7 phage, T12 to a first agent, and a pX conjugated to a second, different phage, R17 phage, M13 phage, MS2 phage, G4 phage, P1 agent. In some embodiments, the first agent is a binding agent phage, Enterobacteria phage P2, P4 phage, dX174 phage, N4 (e.g., biotin). In some embodiments, the second agent is a phage, d6 phage, and d29 phage. Further, any virus that may binding agent that binds the first binding agent (e.g., Strepta be functionalized using the methods, reagents, and/or kits vidin). 
Additional suitable agents include, but are not limited provided herein is within the scope of the present invention, to, click chemistry handles, SNAP-, Clip-, ACP-, and MCP including, but not limited to, those viruses described on pages tags, complementary DNA strands, leucine Zippers, GFP, and 129-653 of Stephen T. Abedon, The Bacteriophages, Oxford toxins, e.g., bacterial and plant toxins. In some embodiments, University Press, USA; 2' edition, Dec. 15, 2005, ISBN: three different target proteins are conjugated to three different 0.195148509; the entire contents of which are incorporated agents, four different agents to four different target proteins, herein by reference. and so on. The invention is not limited in this respect. 0156 Some aspects of this invention provide viruses that 0155 The virus may be any virus suitable for sortase comprise an engineered capsid protein comprising a sortase mediated functionalization as described herein, including, recognition motif, for example, a C-terminal or N-terminal but not limited to, a dsDNA virus comprising a double Sortase recognition motif described herein. Such engineered Stranded DNA genome, an SSDNA virus comprising a single viruses can readily be functionalized according to methods stranded DNA genome, a dsRNA virus comprising a double described herein without the need for further engineering of Stranded RNA genome, a (+)ssRNA virus comprising a single the virus, for example, using recombinant methods. For stranded (+)sense strand RNA genome, a (-)ssRNA virus example, in some embodiments, a phage is provided that comprising a single stranded (-)sense RNA, an SSRNA-RT comprises a capsid protein that does not naturally comprise a Sortase recognition motif at a terminus that is accessible on virus comprising a single-stranded (+)Sense RNA with a the Surface of the phage. 
In some embodiments, the phage is DNA intermediate genome in its life-cycle that is generated an M13 phage, comprising an engineered capsid protein, for by reverse transcription of the RNA genome, or a dsDNA-RT example, a pII, pVIII, or plX protein comprising a recombi virus. Exemplary functionalized viruses include, e.g., Retro nant poly-glycine or poly-alanine sequence (e.g., (G), or viridae (e.g., lentiviruses such as human immunodeficiency (A), wherein n is equal to or greater than 2 at its N-terminus. viruses, such as HIV-I); Caliciviridae (e.g. Strains that cause 0157 Some aspects of this invention provide nucleic acids gastroenteritis); Togaviridae (e.g. equine encephalitis encoding an engineered capsid protein comprising a sortase viruses, rubella viruses); Flaviridae (e.g. dengue viruses, recognition motif. Such nucleic acids can be used to generate encephalitis viruses, yellow fever viruses, hepatitis C virus); virus particles comprising the engineered capsid proteins, (e.g. coronaviruses); Rhabdoviridae (e.g. which can then be functionalized according to the methods vesicular stomatitis viruses, rabies viruses); Filoviridae (e.g. described herein. In some embodiments, an isolated nucleic Ebola viruses); Paramyxoviridae (e.g. parainfluenza viruses, acid is provided that encodes a viral capsid protein compris mumps virus, measles virus, respiratory syncytial virus); ing an N-terminal or a C-terminal sortase recognition motif. Orthomyxoviridae (e.g. influenza viruses); Bunyaviridae In some embodiments, the nucleic acid is a recombinant nucleic acid. In some embodiments, the Sortase recognition (e.g. Hantaan viruses, bunga viruses, phleboviruses and Nairo motif is inserted into a wild-type nucleic acid sequence viruses); Arenaviridae (hemorrhagic fever viruses); Reoviri encoding the capsid protein. In some embodiments, the dae (erg., reoviruses, orbiviurses and rotaviruses); Birnaviri nucleic acid is comprised in an expression vector. 
Such vec dae: Hepadnaviridae (Hepatitis B virus); Parvoviridae (par tors are also provided by aspects of this invention. Such voviruses); Papovaviridae (papilloma viruses, polyoma expression vectors typically comprise the encoding nucleic viruses); Adenoviridae: Herpesviridae (herpes simplex virus acid and additional nucleic acid elements mediating the (HSV) 1 and 2, varicella Zoster virus, cytomegalovirus expression and/or replication of the nucleic acid in a host cell, (CMV), EBV, KSV): Poxyiridae (variola viruses, vaccinia for example, a bacterial host cell in the case of bacterioph viruses, pox viruses); and Picornaviridae (e.g. polio viruses, ages. In some embodiments, the expression construct also hepatitis A virus; enteroviruses, human coxsackie viruses, comprises nucleic acid sequences encoding one or more addi tional capsid proteins of the virus. In some embodiments, the rhinoviruses, echoviruses). In some embodiments, the func expression construct encodes at least two engineered capsid tionalized virus provided is a DNA virus. In some embodi proteins, each comprising a sortase recognition motif. In ments, the functionalized virus is a phage, or bacteriophage. Some embodiments, the Sortase recognition motifs comprised In some embodiments, the functionalized virus is a filamen in the at least two engineered capsid proteins are recognized tous phage. In some embodiments, the functionalized virus is by orthogonal Sortases. In some embodiments, proteins an M13 bacteriophage. In some embodiments, the function encoded by the nucleic acids and expression constructs alized virus provided is a bacteriophage, for example, a bac described herein are provided. US 2014/003 0697 A1 Jan. 30, 2014
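As an illustration of the motif definitions used in paragraphs [0152] and [0156], the following minimal Python sketch checks a protein sequence for an N-terminal (G)n or (A)n run with n equal to or greater than 2, and lists LPXT occurrences. The function names and the example sequences are hypothetical and chosen only for illustration; they are not taken from the sequence listing.

```python
import re


def n_terminal_nucleophile(protein, min_repeats=2):
    """Return (residue, run_length) if the protein starts with a run of G or A
    of at least min_repeats residues, otherwise None."""
    match = re.match(r"G+|A+", protein)
    if match and len(match.group(0)) >= min_repeats:
        run = match.group(0)
        return run[0], len(run)
    return None


def lpxt_motifs(protein):
    """Return (start_index, motif) for every non-overlapping LPXT occurrence,
    where X is any residue."""
    return [(m.start(), m.group(0)) for m in re.finditer(r"LP.T", protein)]


# Hypothetical example sequences (placeholders, not from the source).
print(n_terminal_nucleophile("GGGGGMKTAYIAK"))  # poly-glycine start
print(n_terminal_nucleophile("AADSPHTELP"))     # poly-alanine start
print(lpxt_motifs("KLPETAA"))                   # C-terminal motif in a probe
```

A sequence that begins with a single G or A is rejected, since the text requires n to be at least 2.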

Kits

[0158] Some aspects of this invention provide kits useful for the expression of viral capsid proteins comprising a sortase recognition motif, and for the generation of viral particles that can be functionalized via a sortagging technique described herein. In some embodiments, such a kit comprises a recombinant nucleic acid encoding a viral capsid protein comprising a sortase recognition motif. In some embodiments, the kit further comprises a nucleic acid encoding additional viral genes. In some embodiments, the additional viral genes may comprise at least one additional capsid protein comprising a sortase recognition motif. In some embodiments, the kit comprises nucleic acid sequences encoding two or more capsid proteins comprising different sortase recognition motifs. In some embodiments, the different sortase recognition motifs are recognized by orthogonal sortases, for example, one by SrtAaureus and another by SrtApyogenes. In some embodiments, the kit comprises one or more nucleic acid molecules that together provide all viral genes necessary to generate a viral particle. For example, in some embodiments, the kit provides a nucleic acid sequence encoding M13 pIII comprising a sortase recognition sequence (e.g., poly-glycine) at its N-terminus, and also one or more nucleic acid sequences encoding the M13 genome except wild-type pIII. In some embodiments, the kit provides a nucleic acid sequence encoding M13 pIII comprising a sortase recognition sequence (e.g., poly-glycine) at its N-terminus, a nucleic acid sequence encoding M13 pVIII comprising a sortase recognition sequence (e.g., poly-alanine) at its N-terminus, and one or more nucleic acid sequences encoding the M13 genome except wild-type pIII and pVIII. In some embodiments, the kit provides a nucleic acid sequence encoding M13 pVIII comprising a sortase recognition sequence (e.g., poly-glycine) at its N-terminus, a nucleic acid sequence encoding M13 pIX comprising a sortase recognition sequence (e.g., poly-alanine) at its N-terminus, and one or more nucleic acid sequences encoding the M13 genome except wild-type pVIII and pIX.

[0159] Some kits provided herein comprise the nucleic acids described herein as part of one or more expression constructs. Expression constructs may be in the form of a vector, e.g., a plasmid or phagemid, which can readily be introduced into a host cell, e.g., a bacterial cell that can be infected by a bacteriophage, to generate recombinant viral particles, e.g., M13 particles comprising an M13 pIII protein that contains a sortase recognition motif. Recombinant phage generated from such kits can then be functionalized by a sortagging method described herein.

[0160] In some embodiments, the kit further comprises a sortase. Typically, the sortase comprised in the kit recognizes a sortase recognition motif encoded by a nucleic acid comprised in the kit. In some embodiments, the sortase is provided in a storage solution and under conditions preserving the structural integrity and/or the activity of the sortase. In some embodiments, where two or more orthogonal sortase recognition motifs are encoded by the nucleic acid(s) comprised in the kit, a plurality of sortases is provided, each recognizing a different sortase recognition motif encoded by the nucleic acid(s). In some embodiments, the kit comprises SrtAaureus and/or SrtApyogenes.

[0161] In some embodiments, the kit further comprises a sortase substrate. In some embodiments, the sortase substrate comprises a sortase recognition motif conjugated to an agent. For example, the kit may comprise a sortase substrate comprising a sortase recognition motif that is compatible with a sortase recognition motif encoded by a nucleic acid in the kit in that both motifs can partake in a sortase-mediated transpeptidation reaction catalyzed by the same sortase. For example, if the kit comprises a nucleic acid encoding a capsid protein comprising a SrtAaureus N-terminal recognition sequence, the kit may also comprise SrtAaureus, and a SrtAaureus substrate conjugated to an agent, wherein the sortase substrate will comprise the C-terminal sortase recognition motif. In some embodiments, the kit further comprises a buffer or reagent useful for carrying out a sortase-mediated transpeptidation reaction, for example, a buffer or reagent described in the Examples section.

[0162] The following working examples are intended to describe exemplary reductions to practice of the methods, reagents, and compositions provided herein and do not limit the scope of the invention.

EXAMPLES

Example 1

Sortase-Mediated Modification of M13 Phage Surface Proteins

Experimental Procedures

[0163] Generation of the M13 Phage Constructs.

[0164] The oligonucleotides used to design the different phage constructs are compiled in Table 3. The G5-pIII phage (SEQ ID NO: 77) was engineered by inserting the G5pIIIC and G5pIIINC (SEQ ID NO: 77) annealed oligonucleotides into the M13KE vector (New England Biolabs), previously digested with EagI and Acc65I restriction enzymes. To construct the A2G4-pVIII phage, the M13SK vector was digested with PstI and BamHI restriction enzymes and the A2G4pVIIIC (SEQ ID NO: 9) and A2G4pVIIINC (SEQ ID NO: 9) annealed oligonucleotides were inserted. To engineer the G5HA-pIX construct (SEQ ID NO: 77), the 983 vector was used. This vector was created by refactoring the M13SK vector so the pIX and pVII genes are not overlapping. Upon digestion of this vector with SfiI, the annealed G5HApIXC and G5HApIXNC (SEQ ID NO: 77) oligonucleotides were inserted. The G5-pIII-A2-pVIII (SEQ ID NO: 77) phage construct was created using a modified M13SK vector, which has a DSPHTELP (SEQ ID NO: 116) sequence on pVIII and a biotin acceptor peptide (GLQDIFEAQKIEWHE (SEQ ID NO: 117)) on pIII. Five N-terminal glycines were added to pIII following the above strategy described for G5-pIII phage (SEQ ID NO: 77). The resultant vector was then modified at the N-terminus of pVIII using the QuikChange II site-directed mutagenesis kit (Stratagene) and the pVIIIAADSPH oligonucleotide pair. All the generated phage vectors were transformed into the XL-1 Blue bacterial strain, plated in agar top on LB agar plates containing 1 mM IPTG, 40 µg/mL X-Gal, and 30 µg/mL tetracycline. Plaques were selected and DNA was isolated and sequenced to check for the insertion.
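The sequencing check for the insertion can be mirrored in silico by translating the coding-strand oligonucleotides of Table 3. The sketch below is a hedged illustration only: it uses the standard genetic code and assumes that the reading frame starts at the first base of each top-strand oligonucleotide, which is our assumption and is not stated in the text. The helper name `translate` is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate complete codons of dna starting at offset frame;
    trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Top-strand oligonucleotides as listed in Table 3.
G5PIIIC = "GTACCTTTCTATTCTCACTCTGGTGGAGGCGGTGGATC"
A2G4PVIIIC = "GCTGGCGGGGGAGGG"

print(translate(G5PIIIC))     # assumed frame 0
print(translate(A2G4PVIIIC))  # assumed frame 0
```

Under the assumed frame, the G5pIIIC translation ends in five consecutive glycine codons and the A2G4pVIIIC translation is one alanine followed by four glycines, which is the kind of signature a sequencing read of a positive plaque would be expected to show.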

TABLE 3
Oligonucleotides for phage engineering

Name                  Sequence (5'-3')
G5pIIIC               GTACCTTTCTATTCTCACTCTGGTGGAGGCGGTGGATC (SEQ ID NO: 1)
G5pIIINC              GGCCGATCCACCGCCTCCACCAGAGTGAGAATAGAAAG (SEQ ID NO: 2)
A2G4pVIIIC            GCTGGCGGGGGAGGG (SEQ ID NO: 3)
A2G4pVIIINC           GATCCCCTCCCCCGCCAGCTGCA (SEQ ID NO: )
G5HApIXC              CGGCCATGGCGGGCGGAGGTGGAGGCTACCCATACGATGTTCCAGATTACGCTCAGGG (SEQ ID NO: 5)
G5HApIXNC             TGAGCGTAATCTGGAACATCGTATGGGTAGCCTCCACCTCCGCCCGCCATGGCCGGCT (SEQ ID NO: 6)
AADSPH-pVIII-Top      GTTCCGATGCTGTCTTTCGCTGCTGCAGATTCGCCGCATACTGAG (SEQ ID NO: 7)
AADSPH-pVIII-Bottom   CTCAGTATGCGGCGAATCTGCAGCAGCGAAAGACAGCATCGGAAC (SEQ ID NO: 8)
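To see how the annealed oligonucleotide pairs of Table 3 fit the digested vectors, the paired region and the single-stranded ends of each duplex can be computed. The following is a minimal sketch under the assumption that the two strands anneal by exact, gap-free Watson-Crick pairing; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_duplex(top, bottom):
    """Find the longest exact, gap-free pairing between top and bottom.

    Returns a dict with the paired region (written on the top strand) and the
    unpaired 5'/3' ends of each strand, all written 5'-3'."""
    rc = reverse_complement(bottom)
    best = None
    for offset in range(-(len(rc) - 1), len(top)):
        start_top, start_rc = max(offset, 0), max(-offset, 0)
        length = min(len(top) - start_top, len(rc) - start_rc)
        if best is not None and length <= best[2]:
            continue  # only a strictly longer pairing can replace the best one
        if top[start_top:start_top + length] == rc[start_rc:start_rc + length]:
            best = (start_top, start_rc, length)
    if best is None:
        return None
    start_top, start_rc, length = best
    return {
        "paired": top[start_top:start_top + length],
        "top_5prime": top[:start_top],
        "top_3prime": top[start_top + length:],
        "bottom_5prime": reverse_complement(rc[start_rc + length:]),
        "bottom_3prime": reverse_complement(rc[:start_rc]),
    }


g5 = annealed_duplex("GTACCTTTCTATTCTCACTCTGGTGGAGGCGGTGGATC",
                     "GGCCGATCCACCGCCTCCACCAGAGTGAGAATAGAAAG")
a2g4 = annealed_duplex("GCTGGCGGGGGAGGG", "GATCCCCTCCCCCGCCAGCTGCA")
print(g5)
print(a2g4)
```

For the G5pIII pair this gives a 34-base paired region with 5' GTAC and 5' GGCC single-stranded ends, and for the A2G4pVIII pair a 15-base paired region with a 3' TGCA and a 5' GATC end on the bottom strand. These ends match the overhangs commonly reported for Acc65I, EagI, PstI, and BamHI, respectively, which is consistent with the digests described in paragraph [0164].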

[0165] For phage amplification, the E. coli strain ER2738 (New England Biolabs) in LB media supplemented with 30 µg/mL tetracycline was infected with phage for at least 12 hrs at 37°C. The cultures were centrifuged at 12000 g for 20 min and the phage was precipitated from the supernatant at 4°C with the addition of 1/5 of the supernatant volume of 20% PEG8000/2.5 M NaCl solution. Upon centrifugation at 13500 g for 20 min, the pellet was resuspended in 25 mM Tris, 150 mM NaCl, pH 7.0-7.4 (TBS). For further purification, this resuspension was subjected to two rounds of centrifugation/precipitation. The final phage concentration averaged between 10'-10' plaque forming units (pfu) per mL as determined by UV-vis spectrometry.

[0166] Sortase-Mediated Reactions.

[0167] SrtAaureus and SrtApyogenes were expressed and purified as described. Sortase reactions were performed as indicated in the figures. A typical sortase reaction with SrtAaureus included 200 nM phage, 50 µM SrtAaureus, and 50 µM substrate for small peptides or 20 µM for proteins. The reactions were incubated for 3 hrs at 37°C (for small peptides) or at room temperature (for proteins) in TBS with 10 mM CaCl2. SrtApyogenes-mediated reactions included 8 nM phage, 50 µM SrtApyogenes, and 20 µM substrate, incubated for 3 hr at 37°C in TBS. Where indicated, phage was purified by PEG8000/NaCl precipitation after diluting the reactions with TBS such that the substrate concentration was below 600 nM.

[0168] For the flow cytometry experiments, the G5-pIII-A2-pVIII (SEQ ID NO: 77) phage construct was labeled with K(TAMRA)-LPETAA (SEQ ID NO: 12) on pVIII. The resultant labeled phage was purified by PEG8000/NaCl precipitation, resuspended in TBS, and split into three parts. One part remained unlabeled, and the other two were labeled with either VHH7.LPETG (SEQ ID NO: 10) or anti-GFPLPETG (SEQ ID NO: 10) on pIII. As assessed by the anti-pIII antibody, a yield of 2.5 antibody molecules per virion was achieved in both cases.

[0169] The yield of the sortase-mediated biotinylation reactions was determined using biotinylated GFP as a standard. This was prepared by labeling GFP comprising an LPETG (SEQ ID NO: 10) at its C-terminus with a biotin group using SrtAaureus (GFPLPETGGGK(biotin)) (SEQ ID NO: 281). Known amounts of the purified GFPLPETGGGK(biotin) standard (SEQ ID NO: 281) and varying volumes of the phage labeling reactions were loaded onto the same SDS-PAGE gel and analyzed by immunoblot using streptavidin-HRP (GE Healthcare). The signal obtained in the phage labeling reactions was compared with the signal derived from the GFPLPETGGGK(biotin) (SEQ ID NO: 281) calibration curve, allowing us to infer the amount of phage protein labeled in the reaction. To calculate the labeling efficiency, the amount of labeled protein was divided by the amount of total phage protein loaded into the gel. The phage concentration was determined by UV-vis spectrometry and it was assumed that there were 2700 copies of pVIII, 5 copies of pIII, and 5 copies of pIX per phage particle.

[0170] To determine the yield of GFP-pVIII phage labeling, unincorporated GFP and sortase were removed from phage by PEG8000/NaCl precipitation. Varying volumes of GFP-pVIII phage and known amounts of GFP were loaded onto the same SDS-PAGE gel and analyzed by immunoblot using an anti-GFP-HRP antibody (Santa Cruz Biotechnology). The signal of the GFP-pVIII fusion protein was compared to the signal of the GFP calibration curve as described for the biotinylation reactions. For GFP-pIII and GFP-pIX labeling, the signal of the fusion protein was compared to the input amount of pIII or pIX as detected by anti-pIII (New England Biolabs) or anti-HA (Roche) antibodies, respectively. For GFP-pIII, the input signal consisted of only intact pIII molecules, and lower molecular weight anti-pIII reactive proteins were not included. These proteins can be attributed to proteolyzed pIII. Because the anti-pIII antibody recognizes the C-terminus of the protein, these fragments cannot be labeled using SrtAaureus. In all cases the blots were scanned and densitometric analysis was performed using the ImageJ program (National Institutes of Health). The labeling yield was averaged over three independent reactions with three aliquots from each reaction analyzed. The standard deviation of the reactions was calculated from the averages of the three independent reactions.

[0171] Dynamic Light Scattering (DLS).

[0172] DLS measurements were obtained with a Beckman Delsa-Nano C Particle Analyzer (Beckman Coulter Inc). Phage mixtures were diluted to ~10' pfu/mL in 1 mL of water and loaded into a cuvette. Samples from each experiment were measured in triplicate and the results were averaged by cumulant analysis. Autocorrelation functions were used as a direct comparison of aggregation because aggregates have a slower Brownian motion, causing the signal correlation to be delayed to longer relaxation times.
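The labeling-yield arithmetic described for the biotinylation and GFP reactions can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the straight-line calibration fit is an assumption (the text only refers to a calibration curve), all function names are hypothetical, and the numbers in the usage lines are made-up placeholders. Only the copy numbers per virion are taken from the text.

```python
from statistics import mean, stdev

# Copy numbers per phage particle assumed in the text.
COPIES_PER_VIRION = {"pVIII": 2700, "pIII": 5, "pIX": 5}


def fit_calibration(amounts, signals):
    """Least-squares line through (amount, signal) points of the standard.
    Returns (slope, intercept). A linear response is an assumption."""
    mean_x, mean_y = mean(amounts), mean(signals)
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_signal(signal, slope, intercept):
    """Invert the calibration line to infer the amount of labeled protein."""
    return (signal - intercept) / slope


def labeling_efficiency(labeled_amount, total_amount):
    """Labeled protein divided by total phage protein loaded on the gel."""
    return labeled_amount / total_amount


def labeled_copies_per_virion(efficiency, protein):
    """Scale the efficiency by the assumed copy number of the coat protein."""
    return efficiency * COPIES_PER_VIRION[protein]


def summarize_yield(aliquot_yields_by_reaction):
    """Average the aliquots within each reaction, then report the mean and
    standard deviation across the per-reaction averages."""
    reaction_means = [mean(aliquots) for aliquots in aliquot_yields_by_reaction]
    return mean(reaction_means), stdev(reaction_means)


# Placeholder numbers for illustration only (arbitrary units).
slope, intercept = fit_calibration([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
labeled = amount_from_signal(15.0, slope, intercept)
efficiency = labeling_efficiency(labeled, 3.0)
print(efficiency, labeled_copies_per_virion(efficiency, "pIII"))
print(summarize_yield([[0.3, 0.4, 0.5], [0.5, 0.5, 0.5], [0.6, 0.6, 0.6]]))
```

With the assumed 5 copies of pIII per particle, an efficiency of 0.5 corresponds to 2.5 labeled pIII copies per virion, the same order as the antibody yield reported in paragraph [0168]. Computing the standard deviation from the three per-reaction averages, rather than from all nine aliquots, follows the description in paragraph [0170].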

(0173 Atomic Force Microscopy (AFM). was cloned as a streptavidin.LPETG.HAtag. His (SEQ ID 0.174 Phage preparations were diluted to a concentration NO: 10 and 288) fusion protein using the template Addgene of ~10'pfu/mL, and 100 uL of this mixture were deposited 20860, and expressed as a soluble tetrameric streptavidin'. on a freshly cleaved mica disc. AFM images were taken on a Purification was performed following the same protocol used Nanoscope IV (Digital Instruments) in air using tapping for GFP. Sortase reactions were analyzed on 4-12% Bis mode. The tips had spring constants of 20-100N/m driven Tris SDS-PAGE gels with MES running buffer except for near their resonant frequency of 200-400 kHz (MikroMasch). FIG. 10 which was analyzed on a 12% Laemmli SDS-PAGE Scan rates were approximately 1 Hz. Images were leveled gel. using a first-order plane fit to remove sample tilt. 0181. The K(biotin)-LPETGG (SEQ ID NO: 13), K(bi (0175 Flow Cytometry Analysis. otin)-LPETAA (SEQ ID NO: 12), K(TAMRA)-LPETAA 0176 C57BL/6 mice were purchased from Jackson Labs. (SEQ ID NO: 12), and GGGK(biotin) (SEQ ID NO: 127) Animals were housed at the Whitehead Institute for Biomedi peptides were obtained from the Swanson Biotechnology cal Research and were maintained according to guidelines Center. For mass spectrometry, the protein bands of interest approved by the Massachusetts Institute of Technology were excised, Subjected to protease digestion, and analyzed (MIT) Committee on C