Massively parallel for monitoring genetic consistency and quality control of live viral

Alexander Neverov and Konstantin Chumakov1

Center for Biologics Evaluation and Research, Food and Drug Administration, Rockville, MD 20852

Edited* by Robert H. Purcell, National Institutes of Health, Bethesda, MD, and approved October 6, 2010 (received for review August 24, 2010) Intrinsic genetic instability of RNA may lead to the accumu- uation. MAPREC is currently recommended by the World lation of revertants during manufacture of live viral vaccines, Health Organization (WHO) for screening of batches of OPV requiring rigorous quality control to ensure safety. Each before they can be released for use in humans (11, 18). lot of oral vaccine (OPV) is tested for neurovirulence in Despite high sensitivity and accuracy of MAPREC, it can be animals and also for the presence of neurovirulent revertants. used to monitor only a limited number of genomic loci known to Mutant analysis by PCR and restriction enzyme cleavage (MAPREC) contain the markers of attenuation and misses mutations at other is used to measure the frequency of neurovirulent mutations at the sites, which can also contribute to loss of the attenuated phe- 5′ untranslated region (UTR) of the viral that correlate with notype. For most viruses, determinants of attenuation are un- the level of neurovirulence determined by the monkey neuroviru- known, which limits the utility of this approach for other live viral lence test. However, MAPREC can only monitor mutations at a few vaccines. Nevertheless, ensuring molecular consistency of vac- genomic loci and miss mutations at other sites that could adversely cine batches by requiring that no mutation accumulates beyond affect vaccine quality. Here we propose to use massively parallel the level present in past batches with good clinical record could sequencing (MPS) for sensitive detection and quantification of all help maintain vaccine safety (19). Therefore methods enabling mutations in the entire genome of attenuated viruses. Analysis of detection of unstable genomic loci in attenuated strains could be vaccine samples and reference preparations demonstrated a perfect used for the monitoring of genetic stability. Such studies are agreement with MAPREC results. Quantitative MPS analysis of usually conducted by serial passaging of attenuated strains in validated reference preparations tested by MAPREC produced iden- vivo and in vitro, followed by screening for the presence of small tical results, suggesting that the method could take advantage of numbers of mutants that may have accumulated in viral pop- MICROBIOLOGY the existing reference materials and be used as a replacement for ulations (16, 17, 20, 21). Traditional sequencing approaches can the MAPREC procedure in lot release of OPV. Patterns of mutations detect mutants only if their content exceeds 20–30% and are present at a low level in vaccine preparations were characteristic of therefore of limited utility. Other approaches based on analysis seed viruses used for their manufacture and could be used for iden- of electrophoretic mobility in gels (22, 23) do not always allow fi ti cation of individual batches. This approach may represent the ulti- precise location of mutations. Matrix-assisted laser desorption/ mate tool for monitoring genetic consistency of live viral vaccines. ioniation-time of flight (MALDI-TOF) mass spectrometry (24) and hybridization with microarrays of short oligonucleotides (25, fi mutant quanti cation | quasispecies | sequence heterogeneity | vaccine 26) are more sensitive, but are relatively laborious and may re- safety | neurovirulence quire follow-up by direct sequencing. In the past few years, a set of new technologies for massively igh mutation rates inherent to replication of RNA viruses parallel sequencing (MPS), which are also referred to as high- Hcreate a wide variety of mutants that are present in throughput, next generation, or ultradeep sequencing, have populations, which are often referred to as quasispecies (1). The emerged. They allow rapid generation of extremely large amounts diffuse, “cloud-like” nature of viral populations allows them to of sequence information (27) and are primarily used to sequence rapidly adapt to changing replicative environments by selecting of higher organisms and for metagenomic studies of preexisting variants with better fitness (2–4). Many important virus populations of various bacteria and viruses (28, 29). These new properties cannot be explained by a mere consensus sequence, technologies are also very promising for analysis of viral quasis- but require knowledge about the microvariants present in viral pecies and have been used to study HIV (30, 31), severe acute stocks. One such property is , which is critically relevant to respiratory syndrome (SARS) (32), hepatitis B (33), and other vaccine development and manufacture. RNA viruses (29). MPS was also used for detection of adventitious Attenuated Sabin strains used in manufacture of oral poliovirus agents in vaccines (34). Here we demonstrate that MPS can be vaccine (OPV) (5–7) can regain neurovirulence during growth used to identify very small amounts of mutant viruses in prepara- in vaccine recipients and propagation in cell cultures (8). It tions of live viral vaccines as well as for their accurate quantifica- occurs because of the accumulation of mutants with higher neu- tion. The ability to quantify potentially harmful mutations in rovirulence (9, 10), which include changes in the 5′ untranslated vaccine batches makes this method suitable for quality control region (UTR) (10). Because of this genetic instability, every batch and ensuring manufacturing consistency of live viral vaccines. of OPV must be tested for neurovirulence in monkeys (11) or transgenic mice susceptible to poliomyelitis (12, 13). Previously, we Results developed a highly sensitive molecular method, mutant analysis In the first experiment, pyrosequencing (technology developed by PCR and restriction enzyme cleavage (MAPREC), which en- by , a Roche company, further referred to as abled us to quantify the 5′ UTR revertants in monovalent batches of OPV and demonstrate that their content in vaccine lots directly correlates with the results of the monkey neurovirulence test Author contributions: A.N. and K.C. designed research; A.N. and K.C. performed research; (MNVT) (14). The content of 472-C revertants in type 3 OPV is A.N. and K.C. analyzed data; and A.N. and K.C. wrote the paper. usually below 0.8% in batches of vaccine that passed the MNVT, The authors declare no conflict of interest. whereas batches that failed the test usually contain more than *This Direct Submission article had a prearranged editor. 1% of 472-C revertants (15). Similar methods were developed to Freely available online through the PNAS open access option. quantify mutants with 480-A in Sabin 1 and 481-G in Sabin 2 1To whom correspondence should be addressed. E-mail: [email protected]. strains (16, 17) that were also shown to be determinants of atten- gov.

www.pnas.org/cgi/doi/10.1073/pnas.1012537107 PNAS Early Edition | 1of6 Downloaded by guest on September 24, 2021 454 Life Sciences) was used to analyze full-length PCR ampli- Table 1. Frequencies of mutations identified by pyrosequencing cons prepared from two samples of type 3 OPV: one that passed in plasmid DNA, PCR product, and viruses the monkey neurovirulence test and another that failed. Both Mutation frequencies were made from passage level 3 of the Sabin original seed stock (SO+3). The lot that passed the MNVT was manufactured in Total Insertions primary monkey kidney cells, whereas the failed lot was grown in Substitutions, and WI-38 human diploid cells. Fig. 1A shows that the number of Sample read, 106 % deletions, % times each was determined in positive and cDNA strands was about the same, but varied between 2,000 and 10,000 Plasmid 28.1 0.050 0.424 depending on the genomic location with the total for both PCR product 40.1 0.104 0.860 strands being between 6,000 and 15,000. Both samples contained Plaque 18.2 0.121 0.719 a number of mutants, the most prominent of which was the one HeLa p1 20.7 0.143 0.694 HeLa p7 36.6 0.155 0.858 with C2493U mutation, which changes threonine to isoleucine at amino acid 6 of the capsid protein VP1 (35, 36). The mutation Vero p5 19.0 0.197 1.232 does not affect neurovirulence, but the virus with 2493-U rapidly accumulates upon growth in cultured cells (15). Importantly, in agreement with our previous MAPREC results, there was a dif- 0.05%, whereas there were 0.104% mutations in PCR product ference in the content of 472-C mutants, which correlated with and 0.121% in the rederived virus. Although there was no ap- the MNVT results. The vaccine lot that passed the test contained parent bias for the type of mutations detected in plasmid 0.35% 472-C, whereas the lot that failed it contained 2.4%. DNA (Fig. 2), in all other samples prepared by PCR amplifica- There were a number of other mutations reaching several tion there were significantly more transitions (purine–purine and percent of the population in addition to a low level of less than pyrimidine–pyrimidine changes) than transversions (purine– 0.5% present at all nucleotide positions. This background level pyrimidine and pyrimidine–purine changes). After passaging the may represent true heterogeneity due to the quasispecies nature virus in HeLa cells (seven passages), the average content of of the viral population. However, it could also include mutations mutants increased to 0.155% and after passaging in Vero cells that arose during reverse transcription and PCR amplification or (five passages), was 0.197%. The excess of mutants in the pas- represent the intrinsic inaccuracy of the pyrosequencing method. saged virus could be accounted for by their increase at a select To determine the origin of the background, Sabin 3 cDNA number of genomic loci, including 472-C, 2493-U, and several plasmid, PCR amplicons prepared from the plasmid, and the others, while the overall background level remained relatively virus stock recovered by transfecting HeLa cells with T7 tran- unchanged. Therefore, mutants at roughly 0.1% represent in- scripts of the plasmid, both before and after passaging in Vero trinsic errors of sequencing procedures and mutations in- cells, were analyzed by MPS. Table 1 shows that there was troduced by PCR amplification. Mutants in excess of this level a difference in the level of background sequence heterogeneity may reflect the true sequence heterogeneity of viral quasispecies. detected in the plasmid DNA, PCR amplicons, and virus stocks. It is obvious from the results in Table 1 that there was a sig- The average frequency of mutations in DNA plasmids was nificant number of deletions, which exceeded the number of nucleotide substitutions. Because poliovirus contains one long

Fig. 1. MPS analysis of two batches of type 3 OPV performed by pyrose- quencing. (A) The number of times each nucleotide was read in forward (green) and reverse (red) orientations. (B and C) Mutational profiles for vaccine batches that failed and passed the MNVT, respectively. Here and in Fig. 2. Frequency of different types of nucleotide substitutions detected by all other figures the contents of mutants is shown by colored bars: mutations MPS (Roche 454) analysis in plasmid DNA (A), PCR-amplified DNA (B), and to A shown in orange, mutations to C in red, mutations to G in blue, and virus rederived from T7 transcript of the plasmid (C). Bars represent the mutations to U in green. frequency of all possible transitions and transversions.

2of6 | www.pnas.org/cgi/doi/10.1073/pnas.1012537107 Neverov and Chumakov Downloaded by guest on September 24, 2021 ORF, the presence of insertions or deletions suggested a meth- odological artifact. To study this phenomenon, we analyzed a set of eight poliovirus samples by both pyrosequencing and Illumina methods. Various characteristics of both datasets shown in Table 2 demonstrate that the average sequence length was significantly higher for pyrosequencing (195 vs. 38), but the Illumina method produced roughly 60 times more sequence information. There- fore, the average number of times each nucleotide was se- quenced was 885 for pyrosequencing and 56,650 for Illumina. The number of substitutions detected by each method was roughly the same, but the number of insertions and deletions was fi Fig. 3. Patterns of mutations in the vicinity of nucleotide 472 in Sabin 3 signi cantly lower in the samples analyzed by Illumina: 0.01% vs. genome revealed by pyrosequencing. “Passed” WHO reference for MAPREC 1.49% in pyrosequencing. Therefore, although both methods 96/572 (A); US National neurovirulence reference for type 3 OPV NC2 (B), and appear suitable for analysis of viral quasispecies, Illumina pro- “failed” WHO reference for MAPREC 96/578 (C). duced a better quality and greater number of sequences and thus may be the preferred method. Results shown in Fig. 1 suggest that MPS could be used for the test. The pattern of mutations present in vaccines made from the quantification of the proportion of the critically important 472-C same seed virus was highly reproducible. This is illustrated by mutants, which determine the level of neurovirulence of the a cluster of mutations in the viral 3D polymerase gene (Fig. 5). virus stock. To assess the accuracy of this quantification, US The mutations A5467G, U5473C, and U5476C (marked 1, 2, and 3 neurovirulence reference NC2 (that by definition represents in the figure) are located in the third positions of codons and are marginally acceptable vaccine) was tested along with two WHO silent. Mutations U5473C and U5476C (2 and 3) were present in International reference preparations for MAPREC: one that the SO+2 stock (Fig. 5A) from which all SO-based vaccines were represents acceptable (96/572) and another unacceptable vaccine made. These mutations were never present in the same RNA (96/578). By design, these WHO reference preparations for molecule; therefore, the proportion of the virus population that MAPREC bracket the proportion of mutants in the NC2 refer- contains these mutations is greater than 10%. A vaccine made in ence. Fig. 3 shows profiles of the sequence heterogeneity of these primary monkey kidney cells at SO+3 level by manufacturer A samples in the vicinity of nucleotide 472. The contents of 472-C contains about the same amount of these two mutants (Fig. 5B), MICROBIOLOGY in NC2 as determined from MPS data were 0.687% and was very whereas vaccine made by manufacturer B contains one addi- close to our historical MAPREC data (0.729 ± 0.156%). The tional A5467G mutant (Fig. 5C). Passaging of this vaccine 10 contents of 472-C for the 96/572 reference was 0.556% (0.642 ± times in Vero cells did not result in further accumulation of these 0.104% according to MAPREC) and for the 96/578 reference variants, and their content even decreased slightly (Fig. 5D). A was 1.072% (1.010 ± 0.161% according to MAPREC). There- vaccine lot made in human diploid cells by manufacturer C fore, there was a very close match between our previous MAP- contained a significantly increased amount of A5467G (Fig. 5E). REC data and the results obtained by MPS. Not surprisingly, vaccine prepared from plaque-purified RSO There was also good agreement between the contents of seed virus (Fig. 5F) did not contain mutants 2 or 3. However, it mutants at another unstable genomic position, 2,493, which was contained a small amount of mutant 1, which probably accumu- mentioned above. Although this mutation does not contribute to lated in the course of limited passaging used to prepare master neurovirulence, its content in vaccines represents a marker of and working seed viruses and to produce a lot of vaccine. This seed virus. Vaccines produced from the so-called Sabin original vaccine batch was made by manufacturer B, similar to the batch (SO) seed typically contain from 30 to 90% of these mutants, shown in Fig. 5C, suggesting that production conditions used by whereas vaccines prepared from plaque-purified rederived Sabin this manufacturer favor selection of this mutant. Fig. 5 G and H original (RSO, or “Pfizer”) seed stock contain mostly 2,493-C shows results with vaccines made by manufacturers D and E, and only 1–4% of 2,493-U (15). Fig. 4 shows the results of MPS which use plaque-purified Sabin 3 seed viruses different from analysis of NC2 prepared from SO seed virus and a batch of RSO. One of these vaccines did not harbor any of the three RSO-based vaccine. There was 45.3% of 2,493-U in NC2 and mutants, whereas another contained a small amount of mutant 1 2.7% in the RSO vaccine lot according to the MPS analysis, very and 100% of mutant 2 plus one other mutant A5482U, which close to our previous MAPREC data. The profile of other were presumably selected at random during plaque purification mutations present in SO- and RSO-based vaccines was different, and amplification. Thus, analysis of nucleotide heterogeneities suggesting that the method could be used as a seed virus identity could be used to identify the origin of vaccine stocks and to

Table 2. Comparison of pyrosequencing and Illumina methods for MPS Number of Total nucleotides Average coverage % mutations % insertions Average length sequences, 103 read, 106 per nucleotide (substitutions) and deletions

Sample Roche 454 Illumina Roche 454 Illumina Roche 454 Illumina Roche 454 Illumina Roche 454 Illumina Roche 454 Illumina

1 218 38 69 25,177 11.5 850.6 774 57,154 0.20 0.30 1.17 0.01 2 199 38 101 26,361 15.2 881.8 1,025 59,254 0.23 0.20 1.61 0.01 3 223 38 108 26,398 18.6 912.5 1,254 61,390 0.17 0.19 1.75 0.01 4 179 38 77 26,272 10.5 868.5 709 58,429 0.21 0.18 1.67 0.00 5 196 38 95 25,667 14.2 850.9 956 57,246 0.18 0.16 1.26 0.00 6 207 38 79 26,065 12.3 719.8 830 48,423 0.22 0.19 1.40 0.01 7 159 38 79 24,359 9.2 855.8 616 57,577 0.23 0.17 1.55 0.00 8 180 38 102 24,669 13.6 799.6 917 53,796 0.26 0.30 1.47 0.01 Average 195 38 89 25,621 13.2 842.4 885 56,659 0.21 0.21 1.49 0.01

Neverov and Chumakov PNAS Early Edition | 3of6 Downloaded by guest on September 24, 2021 One critical advantage of MPS over MAPREC is that it allows all nucleotide positions in complete viral genomes to be screened in one assay and therefore addresses concerns that mutations at unknown genomic loci could emerge but remain undetected and thus compromise vaccine quality. We were able to distinguish patterns of mutations that were characteristic for specific seed virus; therefore, groups of vaccine lots produced from the same seed could be identified by the specific patterns of mutations that they contain. For instance, lots produced from the Sabin original seed stock of type 3 poliovirus contained three closely located mutations in the 3D polymerase gene: A5467G, U5473C, and U5476C. These three silent mutations were never present in the same RNA molecule, and together they constitute about 10–20% of the virus Fig. 4. Mutational profiles of two batches of type 3 OPV prepared from SO population. They do not appear to provide a clear or universal seed virus (A) and plaque-purified RSO seed virus (B). selective advantage in vitro because after 10 passages in Vero cells, their contents remained mostly unchanged. However, at least one of the mutants (A5467G) was consistently and independently se- monitor consistency of vaccine production conditions to avoid lected under manufacturing conditions used by two vaccine man- selection of mutants. ufacturers. This example shows that MPS could be used to detect the presence of a low frequency of signature mutations that reflect Discussion subtle differences in manufacturing conditions and therefore be In this communication we report the use of MPS for analysis of used for monitoring molecular consistency of viral vaccines. Unlike mutants present in OPV and demonstrate that the method was OPV, other live viral vaccines are not routinely tested for residual highly sensitive and allowed us to simultaneously detect and neurovirulence at the level of individual vaccine batches, but phe- accurately quantify mutants at all positions of the entire viral notypic tests to ensure genetic consistency of vaccine manufacture genome. Close match between the MPS results obtained for the are an important part of qualification of seed viruses and are per- calibrated reference materials previously tested in MAPREC formed when significant changes to production conditions are in- suggested that MPS could be used instead of MAPREC for lot troduced. Profiles of mutations present in vaccine batches could be release of OPV and take advantage of the existing validated re- analyzed by MPS and used to ensure consistency in the manuf- ference reagents. In a recent report, OPV and other viral vac- acturing process, which is a cornerstone of vaccine safety. The cines were analyzed by MPS for the presence of sequence het- ability of MPS to measure small quantities of mutants could also erogeneities, but the authors failed to detect any connection be used in basic virus research to study effects of different muta- between the presence of minority sequence variants and vaccine tions on virus fitness by monitoring their selection at different quality (34). This may have been due to either insufficient sen- growth conditions. sitivity of the MPS method used, or the absence of reference Analysis of a plasmid DNA, a PCR product amplified from it, as materials previously tested by other methods. In either case the well as virus stocks recovered from the plasmid after T7 tran- present communication uniquely reports the utility of MPS for scription revealed that there was on average 0.05% of mutations in quality control of viral vaccines. the homogeneous plasmid DNA preparation, presumably due to

Fig. 5. Patterns of silent mutations in a region of 3D/polymerase gene. Vaccines are: SO+2 (A); SO+3, manufacturer A (B); SO+3, manufacturer B (C); same as C but passaged 10 times in Vero cells (D); SO+3, manufacturer C (E); RSO, manufacturer B (F), plaque-purified seed stock, manufacturer D (G); plaque-purified seed stock, manufacturer E (H).

4of6 | www.pnas.org/cgi/doi/10.1073/pnas.1012537107 Neverov and Chumakov Downloaded by guest on September 24, 2021 the inherent rate of base-calling errors. This number is consistent mini kit (Invitrogen). Isolated RNA was reverse transcribed by ThermoScript − with the average Phred sequence quality value for the data ana- RNase H reverse transcriptase (Invitrogen) using A3-As primer (T30CCTCCG- lyzed in this work. The rate of mutations in the PCR amplicon AATTAAAGAAAAATTTACCCCTAC). The reaction was carried out for 5 h at μ produced from the plasmid was higher (about 0.1%). Prevalence 55 °C in a total volume of 20 L using conditions recommended by the man- fi ufacturer. After completion of the RT step, residual template DNA was re- of transitions over transversions in all PCR-ampli ed samples fi fl moved by digestion with 2 U of RNase H (Invitrogen). cDNA was PCR ampli ed probably re ects the intrinsic bias of nucleotide misincorporation either as a full-length amplicon or in two partially overlapping amplicons. Full- typical for most polymerases. The background level of mutations length amplification was performed as previously described (37). For two- produced by PCR amplification and base-calling errors inherent to amplicon reactions, PCR primers S1-T7: GCGGCCGCTAATACGACTCACTATAG- any sequencing technology puts a lower limit on the number of GTTAAAACAGCTCTGGGGTTG and 3797R: CCCTGCTCCATGGCCTCTTCCTCGT- mutations that could be detected by this method. AAGC were used for amplification of the left half of the genome and primers Two versions of MPS were used in this study: pyrosequencing 3734F: ATCGGAATCGTGACAGCTGGTGGAGAGGG and S3-R: CCTCCGAATTA- (454 Life Sciences, a Roche company) and Illumina methods. AAGAAAAATTTACCCCTACAACAGTATGACCCAATCC were used for the right Whereas pyrosequencing produced longer sequences, the Illu- half. In both cases, reactions were performed with Platinum PCR SuperMix high fidelity kit (Invitrogen). PCR products were purified using QIAquick PCR mina approach generated roughly 60 times more sequence in- purification kit (Qiagen), and their quality was checked by agarose gel elec- formation, allowing more sensitive detection of minority sequence trophoresis and by measuring absorbance at 260 and 280 nm using a Nano- variants. Although the accuracy of base calls was roughly similar, Drop 1000 spectrophotometer (Thermo Scientific). Two half-length PCR pyrosequencing erroneously identified a significant number of products were mixed in equimolar proportion and subjected to MPS. insertions and deletions, especially in regions where the same base Massively parallel sequencing was performed by either pyrosequencing is repeated several times. This shortcoming is inherent to the (454 Life Sciences technology) (38) or Illumina/Solexa method (39). Pyrose- base-calling method used in pyrosequencing and was largely quencing was done at the Laboratory of Molecular Technology, Advanced compensated by software used for data processing. Therefore, the Technology Program, SAIC (Frederick, MD). An experimental format identical ’ Illumina approach may be preferable, but pyrosequencing can to the manufacturer s protocol for de novo genome sequencing was used fi throughout the study. DNA was sheared using an S2 sonicator (Covaris) to also be successfully used for screening and quanti cation of mu- produce DNA molecules with an average length of 500 bp. To avoid contam- tants in viral vaccines. In this work, we have used both full-length ination, each sample was sonicated in a single-use tube. Libraries for se- PCR amplification, as well as amplification of two overlapping quencing were prepared using GS FLX Titanium Rapid Library preparation kit segments that covered the entire genome; there was no difference (454 Life Sciences). Libraries were clonally amplified by emulsion PCR (emPCR) in the results between these two approaches. with GS FLX Titanium MV emPCR kit (Lib-L) (454 Life Sciences). Beads with fi Practical aspects of implementing this approach must be ampli ed libraries were loaded onto GS FLX Titanium PicoTiterPlate with MICROBIOLOGY considered. The MPS equipment is highly sophisticated, expen- dividers with separate reaction chambers to accommodate eight individual sive, and requires maintenance and operation by trained per- samples. Sequencing reactions were carried out using FLX Genome Sequencer sonnel. Although it is still relatively expensive, its cost, which (454 Life Sciences) with GS FLX Titanium reagents (454 Life Sciences). Initial data treatment (image acquisition, base calling, quality estimation) and data varies depending on the number of samples being analyzed in fi fi extraction and ltering were performed using software supplied by 454 one run, is signi cantly lower that that of neurovirulence tests. It Life Sciences with default settings. is widely expected that the cost of MPS will undergo a dramatic Analysis using the Illumina/Solexa platform was performed at Macrogen. reduction with the introduction of next generation methods DNA shearing, library preparation, and cluster generation were conducted based on ion torrent technology or single-molecule sequencing. according to the manufacturer’s recommendations. Sequencing reactions In conclusion, making MPS analysis a part of vaccine lot re- were performed on a Genetic Analyzer IIx (Illumina) using a 38-cycle setting lease, characterization of new vaccine strains, and qualification with a single-read option. Each sample was loaded onto a single line of the of seed stocks promises to significantly increase molecular con- flow cell for a total of eight samples per run. Initial data analysis was per- ’ sistency and therefore ensure the safety of live viral vaccines, and formed by the instrument s software, using default options. also move them one step closer to being well-characterized Bioinformatic Treatment. MPS data were processed using custom software biological products. developed in this laboratory. Sequence data with Phred quality scores below Materials and Methods 20 were discarded. Newly generated sequences were aligned with amplicon sequences, and mutations at each nucleotide were recorded. Results with Virus Preparations and Growth. Reference preparations of OPV were obtained significant strand polarity bias, as well as insertions and deletions adjacent to from WHO and Center for Biologics Evaulation and Research (CBER) re- homopolymeric stretches in pyrosequencing data were disregarded to reduce positories, and monovalent vaccine lots were obtained from their manu- the effect of sequencing artifacts. The results were used to construct plots facturers. Serial virus passaging was performed in HeLa and Vero cells by showing the representation of each mutation in the genome. inoculating cell cultures at the multiplicity of infection of about 1 TCID50/cell. ACKNOWLEDGMENTS. We thank Dr. Ellie Ehrenfeld, Dr. Steven Rubin, and Preparation of DNA for Analysis. Viral RNA was isolated from cell culture Dr. Keith Peden for reading the manuscript. The work was supported by supernatants or from monovalent vaccine lots using PureLink viral RNA/DNA funds from the Food and Drug Administration’s intramural research program.

1. Domingo E, et al. (1985) The quasispecies (extremely heterogeneous) nature of viral the human alimentary tract. Pan-American Health Organization (PAHO, Washington, RNA genome populations: Biological relevance—a review. Gene 40:1–8. DC), pp 179–198. 2. Domingo E, et al. (2006) Viruses as quasispecies: Biological implications. Curr Top 9. Cann AJ, et al. (1984) Reversion to neurovirulence of the live-attenuated Sabin type 3 Microbiol Immunol 299:51–82. oral poliovirus vaccine. Nucleic Acids Res 12:7787–7792. 3. Novella IS, Domingo E, Holland JJ (1995) Rapid viral quasispecies evolution: 10. Evans DM, et al. (1985) Increased neurovirulence associated with a single nucleotide change Implications for vaccine and drug strategies. Mol Med Today 1:248–253. in a noncoding region of the Sabin type 3 poliovaccine genome. Nature 314:548–550. 4. Ruiz-Jarabo CM, Arias A, Baranowski E, Escarmís C, Domingo E (2000) Memory in viral 11. World Health Organization (2002) Recommendations for the production and control quasispecies. J Virol 74:3543–3547. of poliomyelitis vaccine (oral). WHO Technical Report Series (World Health Organi- 5. Sabin AB (1955) Characteristics and genetic potentialities of experimentally produced zation, Geneva, Switzerland), pp 31–93. and naturally occurring variants of poliomyelitis virus. Ann NY Acad Sci, 61(4):924–938 12. Dragunsky E, et al. (1996) A poliovirus-susceptible transgenic mouse model as discussion, 938–939. a possible replacement for the monkey neurovirulence test of oral poliovirus vaccine. 6. Sabin AB (1959) Present position of immunization against poliomyelitis with live virus Biologicals 24:77–86. vaccines. BMJ 1:663–680. 13. Levenbook I, Dragunsky E, Pervikovc Y (2000) Development of a transgenic mouse 7. Sabin AB (1985) Oral poliovirus vaccine: History of its development and use and neurovirulence test for oral poliovirus vaccine: International collaborative study 1993- current challenge to eliminate poliomyelitis from the world. JInfectDis151:420– 1999. Vaccine 19:163–166. 436. 14. Chumakov KM, Powers LB, Noonan KE, Roninson IB, Levenbook IS (1991) Correlation 8. Benyesh-Melnick M, Melnick JL (1959) The use of in vitro markers and monkey between amount of virus with altered nucleotide sequence and the monkey test for neurovirulence tests to follow genetic changes: In attenuated poliovirus multiplying in acceptability of oral poliovirus vaccine. Proc Natl Acad Sci USA 88:199–203.

Neverov and Chumakov PNAS Early Edition | 5of6 Downloaded by guest on September 24, 2021 15. Chumakov KM, et al. (1992) RNA sequence variants in live poliovirus vaccine and their 27. Rogers YH, Venter JC (2005) : Massively parallel sequencing. Nature 437: relation to neurovirulence. J Virol 66:966–970. 326–327. 16. Rezapkin GV, et al. (1994) of Sabin 1 strain in vitro and genetic 28. Eisen JA (2007) Environmental shotgun sequencing: Its potential and challenges for stability of oral poliovirus vaccine. Virology 202:370–378. studying the hidden world of microbes. PLoS Biol 5:e82. 17. Rezapkin GV, et al. (1995) Microevolution of type 3 Sabin strain of poliovirus in cell 29. Victoria JG, et al. (2009) Metagenomic analyses of viruses in stool samples from cultures and its implications for oral poliovirus vaccine quality control. Virology 211: children with acute flaccid paralysis. J Virol 83:4642–4651. 377–384. 30. Rozera G, et al. (2009) Massively parallel pyrosequencing highlights minority variants 18. Wood DJ, Heath AB, Chumakov KM, Norwood L, Pomeroy K (1996) Molecular analysis in the HIV-1 env quasispecies deriving from lymphomonocyte sub-populations. of the quality control of oral poliovirus vaccine. Proposal to establish an International Retrovirology 6:15. Standard for MAPREC analysis of poliovirus type 3 (Sabin). WHO Publication BS/96 31. Tsibris AM, et al. (2009) Quantitative deep sequencing reveals dynamic HIV-1 escape – 1841:1 44. and large population shifts during CCR5 antagonist therapy in vivo. PLoS ONE 4: 19. Chumakov KM (1999) Molecular consistency monitoring of oral poliovirus vaccine and e5683. other live viral vaccines. Dev Biol Stand 100:67–74. 32. Eckerle LD, et al. (2010) Infidelity of SARS-CoV Nsp14-exonuclease mutant virus 20. Chumakov KM, et al. (1994) Consistent selection of mutations in the 5′-untranslated replication is revealed by complete genome sequencing. PLoS Pathog 6:e1000896. region of oral poliovirus vaccine upon passaging in vitro. J Med Virol 42:79–85. 33. Margeridon-Thermet S, et al. (2009) Ultra-deep pyrosequencing of hepatitis B virus 21. Taffs RE, et al. (1995) Genetic stability and mutant selection in Sabin 2 strain of oral quasispecies from nucleoside and nucleotide reverse-transcriptase inhibitor (NRTI)- poliovirus vaccine grown under different cell culture conditions. Virology 209: treated patients and NRTI-naive patients. J Infect Dis 199:1275–1285. 366–373. 34. Victoria JG, et al. (2010) Viral nucleic acids in live-attenuated vaccines: Detection of 22. Orita M, Iwahana H, Kanazawa H, Hayashi K, Sekiya T (1989) Detection of minority variants and an adventitious virus. J Virol 84:6033–6040. polymorphisms of human DNA by gel electrophoresis as single-strand conformation 35. Weeks-Levy C, et al. (1991) Identification and characterization of a new base polymorphisms. Proc Natl Acad Sci USA 86:2766–2770. – 23. Muyzer G, de Waal EC, Uitterlinden AG (1993) Profiling of complex microbial substitution in the vaccine strain of Sabin 3 poliovirus. Virology 185:934 937. populations by denaturing gradient gel electrophoresis analysis of polymerase chain 36. Chumakov K, et al. (1992) RNA sequence variability in attenuated poliovirus. Vaccines ’ reaction-amplified genes coding for 16S rRNA. Appl Environ Microbiol 59:695–700. 92, eds Brown F, Chanock R, Ginsberg H, Lerner R (Cold Spring Harbor Laboratory – 24. Amexis G, et al. (2001) Quantitative mutant analysis of viral quasispecies by chip- Press, Cold Spring Harbor, NY), pp 331 335. based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. 37. Chumakov KM (1996) PCR engineering of viral quasispecies: A new method to Proc Natl Acad Sci USA 98:12097–12102. preserve and manipulate genetic diversity of RNA virus populations. J Virol 70: 25. Proudnikov D, et al. (2000) Analysis of mutations in oral poliovirus vaccine by 7331–7334. hybridization with generic oligonucleotide microchips. Biologicals 28:57–66. 38. Ronaghi M, Uhlen M, Nyren P (1998) A sequencing method based on real-time 26. Cherkasova E, et al. (2003) Microarray analysis of evolution of RNA viruses: Evidence pyrophosphate. Science, 281(5375): 363, 365. of circulation of virulent highly divergent vaccine-derived . Proc Natl Acad 39. Mardis ER (2008) Next-generation DNA sequencing methods. Annu Rev Genomics Sci USA 100:9398–9403. Hum Genet 9:387–402.

6of6 | www.pnas.org/cgi/doi/10.1073/pnas.1012537107 Neverov and Chumakov Downloaded by guest on September 24, 2021