
(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization, International Bureau

(10) International Publication Number: WO 2015/157189 A1

(43) International Publication Date: 15 October 2015 (15.10.2015)

(51) International Patent Classification: A61K 39/12 (2006.01); A61K 39/145 (2006.01); A61K 39/385 (2006.01); A61K 38/00 (2006.01)

(21) International Application Number: PCT/US2015/024563

(22) International Filing Date: 6 April 2015 (06.04.2015)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
61/976,024, 7 April 2014 (07.04.2014), US
62/073,717, 31 October 2014 (31.10.2014), US
62/073,777, 31 October 2014 (31.10.2014), US
62/127,775, 3 March 2015 (03.03.2015), US

(71) Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA [US/US]; 1111 Franklin Street, Twelfth Floor, Oakland, California 94607-5200 (US).

(72) Inventors: SUN, Ren; c/o UCLA Office of Intellectual Property Administration, 11000 Kinross Ave., Ste 200 MC 140607, Los Angeles, California 90095-1406 (US). YOUNG, Arthur; c/o InvVax, Inc., 570 Westwood Plaza, Bldg. 114, Los Angeles, California 90095 (US). WU, Nicholas C.; c/o UCLA Office of Intellectual Property Administration, 11000 Kinross Ave., Ste 200 MC 140607, Los Angeles, California 90095-1406 (US).

(74) Agent: SUNDBY, Suzannah K.; Canady + Lortz LLP, 1050 30th Street, NW, Washington, District of Columbia 20007 (US).

(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).

Declarations under Rule 4.17:
- as to applicant's entitlement to apply for and be granted a patent (Rule 4.17(ii))
- as to the applicant's entitlement to claim the priority of the earlier application (Rule 4.17(iii))

Published:
- with international search report (Art. 21(3))
- before the expiration of the time limit for amending the claims and to be republished in the event of receipt of amendments (Rule 48.2(h))
- with sequence listing part of description (Rule 5.2(a))


(54) Title: VACCINES AND USES THEREOF

(57) Abstract: Disclosed herein are methods, compositions, vaccine formulations, kits, and reagents for developing a vaccine, e.g., a peptide-based vaccine. Also described herein are methods for identifying a peptide for use in developing a viral vaccine, such as a viral vaccine against an influenza virus.

VACCINES AND USES THEREOF

[0001] CROSS-REFERENCE TO RELATED APPLICATIONS

[0002] This application claims the benefit of U.S. provisional patent application nos. 61/976,024, filed April 7, 2014; 62/073,717, filed October 31, 2014; 62/073,777, filed October 31, 2014; and 62/127,775, filed March 3, 2015; which are herein incorporated by reference in their entirety.

[0003] REFERENCE TO A SEQUENCE LISTING SUBMITTED VIA EFS-WEB

[0004] The content of the ASCII text file of the sequence listing named "20150406_034044_130WOl_seq", which is 92.1 kb in size, was created on April 6, 2015, and was electronically submitted via EFS-Web herewith the application, is incorporated herein by reference in its entirety.

[0005] ACKNOWLEDGEMENT OF GOVERNMENT SUPPORT

[0006] This invention was made with Government support under Grant No. AI028697 awarded by the National Institutes of Health. The Government has certain rights in the invention.

[0007] BACKGROUND

[0008] Influenza A virus is the most common cause of flu in humans, leading to respiratory distress and systemic illness in all ages, and occasionally hospitalization or death in the susceptible young or elderly. Roughly 300,000 to 500,000 deaths per year worldwide are caused by influenza virus. In the twentieth century three major pandemics occurred (1918, 1957, and 1968), causing an estimated 50 million, 500,000, and 2 million deaths, respectively. Influenza B virus can be another culprit that causes flu in humans and is included as part of the flu vaccine each year.

[0009] Viral hepatitis can be a cause of considerable morbidity and mortality in the population, both from acute infection and from chronic sequelae, which can include chronic active hepatitis and cirrhosis caused by hepatitis B, C, and D. Sometimes, hepatitis such as hepatitis B further induces hepatocellular carcinoma, one of the ten most common cancers worldwide.

[0010] The human immunodeficiency virus (HIV) is a lentivirus that can cause acquired immunodeficiency syndrome (AIDS). More than 1 million people in the United States are infected with HIV, including about 14% whose infections have not been diagnosed. Each year, an estimated 50,000 people in the United States are newly infected.

[0011] Genetic research on influenza virus biology has been informed in large part by nucleotide variants present in clinical or laboratory and seasonal or pandemic samples, leaving a substantial part of the genome uncharacterized. Similarly, substantial parts of the hepatitis virus and HIV are also uncharacterized. Elucidation of these uncharacterized regions can improve vaccines to treat and prevent diseases.

[0012] SUMMARY OF THE INVENTION

[0013] In some embodiments, a peptide is provided consisting of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some embodiments, a peptide is provided comprising a sequence with 100% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-27, 29-38, and 40-53, wherein the peptide is at most 50 amino acids in length. In some embodiments, a peptide is provided comprising a sequence with at least 70% sequence identity to at least 15 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 47-53, wherein the peptide is at most 50 amino acids in length. In some embodiments, a peptide is provided comprising a sequence with at least 70% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-27, 32, and 40-46, wherein the peptide is at most 50 amino acids in length. In some embodiments, a peptide is provided comprising a sequence with at least 70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-27, 32-38, and 40-53, wherein the peptide is at most 50 amino acids in length. In some embodiments, a peptide is provided comprising a sequence with at least 70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-2, 5-15, 18-22, 26-32, and 39-46, wherein the peptide is less than 15 amino acids in length. In some embodiments, a peptide is provided comprising a sequence with at least 90% sequence identity to an amino acid sequence of SEQ ID NO: 29 or 31, wherein the peptide is at most 50 amino acids in length. In some embodiments, a peptide is provided comprising at least 70% sequence identity to the amino acid sequence of SEQ ID NO: 28, wherein the peptide is 8 amino acids in length.

[0014] In some embodiments, any of the above peptides is attached to a lipid. In some embodiments, the lipid comprises a palmitoyl group. In some embodiments, any of the above peptides is attached to a CD4+ (helper) T cell epitope. In some embodiments, any of the above peptides is an isolated peptide.

[0015] In some embodiments, a composition is provided comprising any of the above peptides, wherein the composition comprises a pharmaceutically acceptable carrier. In some embodiments, the composition further comprises an adjuvant. In some embodiments, the composition further comprises a second peptide, wherein the second peptide is not identical to any of the above peptides, wherein the second peptide is not more than 50 amino acids in length. In some embodiments, the second peptide is attached to a lipid. In some embodiments, the lipid comprises a palmitoyl group. In some embodiments, the second peptide is attached to a CD4+ (helper) T cell epitope. In some embodiments, the second peptide comprises at least 70% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some embodiments, the second peptide comprises 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from SEQ ID NOs: 1-53. In some embodiments, the second peptide comprises a sequence comprising 70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53.
In some embodiments, the second peptide comprises a sequence comprising 100% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some embodiments, the second peptide consists of a sequence selected from SEQ ID NOs: 1-53.

[0016] In some embodiments, the composition further comprises from 1 to 52 additional peptides comprising a sequence with at least 70% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the additional peptides is not identical to another peptide in the composition, and wherein each of the additional peptides is not more than 50 amino acids in length. In some embodiments, the composition further comprises from 1 to 52 additional peptides comprising a sequence with at least 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the additional peptides is not identical to another peptide in the composition, and wherein each of the additional peptides is not more than 50 amino acids in length. In some embodiments, the composition further comprises from 1 to 52 additional peptides comprising a sequence with at least 70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the additional peptides is not identical to another peptide in the composition, and wherein each of the additional peptides is not more than 50 amino acids in length.
In some embodiments, the composition further comprises from 1 to 52 additional peptides comprising a sequence with at least 100% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the additional peptides is not identical to another peptide in the composition, and wherein each of the additional peptides is not more than 50 amino acids in length. In some embodiments, the composition further comprises from 1 to 52 additional peptides consisting of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the additional peptides is not identical to another peptide in the composition.

[0017] In some embodiments, provided herein is a nucleic acid molecule encoding any of the above peptides. In some embodiments, provided herein is a composition comprising the nucleic acid molecule, wherein the composition comprises a pharmaceutically acceptable carrier.

[0018] In some embodiments, provided herein is a protein that binds to an amino acid sequence with at least 70% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some embodiments, provided herein is a protein that binds to an amino acid sequence with at least 70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some embodiments, provided herein is a protein that binds to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some embodiments, the protein is an antibody or an antibody fragment. In some embodiments, provided herein is a composition comprising the protein, wherein the composition comprises a pharmaceutically acceptable carrier.

[0019] In some embodiments, any of the above compositions are formulated for subcutaneous, intramuscular, intranasal, or intradermal administration.
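As an illustration of the identity thresholds recited in the preceding paragraphs (for example, at least 70% sequence identity to at least 8 contiguous amino acids of a reference sequence, with a peptide length of at most 50 amino acids), the following minimal Python sketch computes ungapped percent identity over sliding windows. The function names, the position-by-position (ungapped) comparison, and the fixed window length are assumptions made for illustration; this passage does not specify an alignment algorithm.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(peptide: str, reference: str, window: int = 8) -> float:
    """Highest ungapped identity between any `window`-residue stretch of the
    peptide and any `window`-residue stretch of the reference.
    Returns 0.0 if either sequence is shorter than the window."""
    best = 0.0
    for i in range(len(peptide) - window + 1):
        for j in range(len(reference) - window + 1):
            identity = percent_identity(peptide[i:i + window],
                                        reference[j:j + window])
            best = max(best, identity)
    return best


def meets_threshold(peptide: str, reference: str, min_identity: float = 70.0,
                    window: int = 8, max_length: int = 50) -> bool:
    """Hypothetical check combining the length cap and the identity threshold."""
    if len(peptide) > max_length:
        return False
    return best_window_identity(peptide, reference, window) >= min_identity
```

A gapped alignment would generally give different values for sequences that contain insertions or deletions, so this sketch only covers the simplest reading of "sequence identity".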
[0020] In some embodiments, provided herein is a method comprising administering to a subject any of the above compositions. In some embodiments, the subject is a human. In some embodiments, the subject is infected with a virus. In some embodiments, the virus is an influenza virus. In some embodiments, the influenza virus is influenza A virus, influenza B virus, or influenza C virus. In some embodiments, the influenza virus is an influenza A virus. In some embodiments, the administering induces cross-protection against one or more subtypes of influenza strains in the subject. In some embodiments, the administering induces cross-protection against one or more subtypes of influenza A strains in the subject. In some embodiments, the administration is subcutaneous, intramuscular, intranasal, or intradermal. In some embodiments, the method is a vaccination method.

[0021] In some embodiments, provided herein is a method for identifying a peptide sequence for use in developing a viral vaccine, the method comprising: a) obtaining a nucleic acid library comprising a mutation in a nucleic acid sequence relative to a nucleic acid sequence in a naturally-occurring virus; b) producing a set of viruses, wherein a viral genome of the set of viruses is generated using the nucleic acid library; c) comparing nucleic acid sequences from the nucleic acid library to nucleic acid sequences from the set of viruses; and d) based on c), identifying a peptide sequence for use in developing a vaccine. In some embodiments, step c) further comprises comparing an abundance of the nucleic acid sequence comprising the mutation in the nucleic acid library to an abundance of the nucleic acid sequence comprising the mutation in the set of viruses.
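The abundance comparison in step c) can be pictured with a short Python sketch. It assumes that read counts per mutation are already available for the input nucleic acid library and for the resulting set of viruses; the function name, the minimum-count filter, and the mutation labels in the usage note are hypothetical and are not taken from the disclosure.

```python
from typing import Dict


def abundance_ratios(library_counts: Dict[str, int],
                     virus_counts: Dict[str, int],
                     library_total: int,
                     virus_total: int,
                     min_library_count: int = 1) -> Dict[str, float]:
    """Ratio of each mutation's frequency in the set of viruses to its
    frequency in the input library.

    A ratio near 0 suggests the mutation was depleted during viral growth;
    a ratio near 1 suggests it was tolerated. Mutations seen fewer than
    `min_library_count` times in the library are skipped, because their
    ratio would be undefined or unreliable.
    """
    if library_total <= 0 or virus_total <= 0:
        raise ValueError("total read counts must be positive")
    ratios = {}
    for mutation, lib_n in library_counts.items():
        if lib_n < min_library_count:
            continue
        lib_freq = lib_n / library_total
        virus_freq = virus_counts.get(mutation, 0) / virus_total
        ratios[mutation] = virus_freq / lib_freq
    return ratios
```

For instance, calling `abundance_ratios({"mut_A": 100, "mut_B": 100}, {"mut_A": 100}, 1000, 1000)` with these made-up counts gives a ratio of 1.0 for `mut_A` and 0.0 for `mut_B`, so `mut_B` would be the position of interest under this reading of step c).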
[0022] In some embodiments, provided herein is a method of identifying a peptide for use in developing a vaccine, the method comprising: a) obtaining a library of nucleic acid sequences, wherein the library of nucleic acid sequences comprises a plurality of mutations in the nucleic acid sequences relative to the nucleic acid sequences in a naturally-occurring virus or organism; b) identifying mutations in the library of nucleic acid sequences that reduce an ability of the virus or organism to propagate; c) based on b), assigning a value to an amino acid position of a protein encoded by a nucleic acid sequence in the library; and d) based on the value assigned in c), identifying a peptide for use in developing a vaccine. In some embodiments, the naturally-occurring virus is an influenza A virus. In some embodiments, the method further comprises performing next generation sequencing of nucleic acids in the nucleic acid library and the set of viruses. In some embodiments, the next generation sequencing comprises use of reversible dye terminator nucleotides. In some embodiments, step d) further comprises an HLA affinity binding analysis using a computer. In some embodiments, step d) further comprises a sequence conservation analysis using a computer.

[0023] Both the foregoing general description and the following detailed description are exemplary and explanatory only and are intended to provide further explanation of the invention as claimed. The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute part of this specification, illustrate several embodiments of the invention and, together with the description, serve to explain the principles of the invention.
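Steps c) and d) of the method in paragraph [0022] can be sketched as a per-position score followed by a sliding-window search. The numeric scores below borrow the rubric given in the description of Fig. 22A (severely impairing, 4; strongly attenuated, 3; moderately attenuated, 2; neutral, 1); the window length, the function names, and the handling of uncovered positions are assumptions for illustration only, not the method as claimed.

```python
from typing import List, Optional, Tuple

# Scoring rubric as described for Fig. 22A; the dictionary keys are
# illustrative labels chosen for this sketch.
SCORES = {
    "severely impaired": 4,
    "strongly attenuated": 3,
    "moderately attenuated": 2,
    "neutral": 1,
}


def score_positions(categories: List[Optional[str]]) -> List[Optional[int]]:
    """Map each amino acid position's category to a numeric value.
    Positions without data (None) stay None."""
    return [SCORES[c] if c is not None else None for c in categories]


def best_window(scores: List[Optional[int]],
                window: int = 8) -> Optional[Tuple[int, float]]:
    """Return (start_index, mean_score) of the highest-scoring contiguous
    window, skipping windows that contain an unscored position.
    Returns None if no fully scored window exists."""
    best = None
    for start in range(len(scores) - window + 1):
        chunk = scores[start:start + window]
        if any(s is None for s in chunk):
            continue
        mean = sum(chunk) / window
        if best is None or mean > best[1]:
            best = (start, mean)
    return best
```

In practice the candidate windows would then be filtered further, for example by the HLA binding and sequence conservation analyses mentioned in the same paragraph; those steps are outside this sketch.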

[0024] BRIEF DESCRIPTION OF THE DRAWINGS

[0025] This invention is further understood by reference to the drawings wherein:

[0026] Fig. 1 shows a flowchart of a method described herein.

[0027] Fig. 2 shows a summary of mutant library properties and a description of the roles of each viral segment. The data presented in table format were based on Sanger sequencing of 4-20 clones per viral segment, from which mutation frequency was estimated. Counting of the bacterial colonies yields complexity. Mutation frequency × complexity gives coverage (the average number of times each possible mutation was present in the library).

[0028] Fig. 3 indicates coverage of mutations at each position in Pop1. 20x means that 20 or more clones had a mutation at a given position; 10x is 10 or more; and 5x is 5 or more.

[0029] Fig. 4 shows the MOIs used for each round of infection. Pilot growth curves were conducted with several MOIs and several durations of infection for each segment at each round. 24 hours post-infection was chosen empirically as the optimal time point of harvesting for all infections. The lowest MOI that yielded detectable virus (>1.4 × 10³ TCID50/mL) at 24 hpi was selected for the full-scale infection of the next round. To maintain the coverage, the absolute number of infectious virions used was at least 5x the complexity of the library, while the number of target cells was adjusted to achieve the desired MOI.

[0030] Figs. 5A-5E illustrate an experimental flowchart and overall essentialness of base positions within segments. Fig. 5A depicts the selection scheme and a description of the four Populations subjected to deep sequencing. Fig. 5B shows a method for distinguishing true mutations from sequencing errors: after sonication and adaptor ligation (with barcode), a limited number of fragments were subjected to PCR. Each starting molecule, marked by start site, end site, orientation, and barcode, was therefore sequenced as about 10 identical copies, each on independent beads. Variants on 1 of 10 reads were errors, while those on 10 of 10 were true mutations. Fig. 5C indicates the percentage of base positions, by segment, that were severely impaired (dark gray bars) or strongly attenuated (light gray bars) after deep sequencing. Fig. 5D illustrates a histogram showing quantities of base positions at intervals of Pop4/Pop1 ratio. Fig. 5E shows the Pop4/Pop1 ratio plotted against the degree of conservation of each gene (reported as sequence entropy).

[0031] Fig. 6 shows the number of variants that scored within each range of -log(p-value). The p value is the percent likelihood that a variant read at a given position was due to random sequencing error, and was calculated with a binomial exact test using a conservative estimate of sequencing error of 1%. The cutoff for classification as true mutation variants was set at p < 10⁻⁴. Sequencing errors generally had high p-values, >0.1. Variants that fell in the uncertain middle were relatively few.

[0032] Fig. 7 illustrates the total counts and percentages of each category of mutation, at the nucleotide level, by segment and total for the genome. No. is the number of base positions falling into the indicated category, and % is the percentage of base positions in each category out of the total number of base positions in the segment; s.i. is severely impaired; s.a. is strongly attenuated; m.a. is moderately attenuated; neut. is neutral; and enh. is enhanced. Percentages were out of all covered positions (Pop1 > 10 occurrences).

[0033] Fig. 8 shows the percentage of base positions that fell into each category by segment. The denominator was all positions with Pop1 > 10 occurrences.

[0034] Fig. 9 indicates total counts and percentages of each category of mutation, at the amino acid level, by segment and total for the genome.

[0035] Figs. 10A-10F illustrate validation of the next-generation sequencing results by reconstruction of individual mutants. Fig. 10A shows examples of severely impairing mutations that proved "lethal", i.e., no detectable infectious titer (TCID50) after C227 transfection. * = mutations that scored as attenuating after deep sequencing. Dotted line, detection limit of the titer assay. Figs. 10B-10D show titers of additional severely impaired mutants after a single round (Fig. 10B) or two rounds of A549 infection (Fig. 10C), or a single short round of infection (13h) (Fig. 10D). Fig. 10E is a competition assay. A549s were co-infected with a 1:1 volume of wild type:mutant from C227 supernatant. The ratio of mutant (light gray area of bar) to wild type (dark gray area) was determined by cloning and Sanger sequencing. HA-890 and NP-1242 were neutrals from the deep sequencing which served as controls, while the rest were severely impaired. Fig. 10F shows titers of representative neutral mutations after C227 transfection.

[0036] Fig. 11 exemplifies expression of mutant compared to wild type plasmid. C227 cells were transfected with one plasmid, total RNA was harvested and reverse transcribed with random hexamers, and expression of the specific gene was quantified by QPCR. Comparison was made to the corresponding wild type plasmid. Only PA-465 was more than 2-fold reduced compared to wild type. Error bars are mean ± s.d.

[0037] Fig. 12 shows an essentialness map of viral segment nucleotide sequences. Viral gene(s) are denoted at left, base position below. Severely impaired, dark gray (-1) lines; strongly attenuated, (-1) light gray; moderately attenuated, (+1) dark gray; neutral, (+2) light gray. Uncovered positions (Pop1 < 10 occurrences) were left unmarked.

[0038] Fig. 13 indicates the distribution of all mutations by codon position for each Population. M1 and NS1 were considered without including overlapping reading frames for M2 and NEP/NS2, respectively. There is a relatively uniform distribution in Pop1, and increasing bias toward the 3rd position from Pop2 to Pop4.

[0039] Fig. 14 shows a distribution of severely impairing mutations by codon position in Pop1. Given that severely impairing mutations had 0 Pop4 occurrences, only data for Pop1 was considered. Only the non-overlapping reading frames of M1 and NS1 for segments 7 and 8 were considered. Six of the segments (excluding PB1 and NS1) showed bias against the 3rd position.
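The error-filtering statistic described for Fig. 6 can be illustrated with a short Python sketch: a one-sided exact binomial test of whether the observed number of variant reads at a position could arise from sequencing error alone. The 1% error rate comes from the text; the function names are hypothetical, the cutoff of 1e-4 assumes that the cutoff stated for Fig. 6 reads p < 10⁻⁴, and the log-space summation is simply one convenient way to evaluate the tail probability.

```python
from math import exp, lgamma, log, log1p


def binomial_tail_p(variant_reads: int, total_reads: int,
                    error_rate: float = 0.01) -> float:
    """P(X >= variant_reads) for X ~ Binomial(total_reads, error_rate).

    This is the probability of seeing at least this many variant reads
    if every variant read were a random sequencing error.
    """
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("need 0 <= variant_reads <= total_reads")
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    log_p, log_q = log(error_rate), log1p(-error_rate)
    tail = 0.0
    for k in range(variant_reads, total_reads + 1):
        # log of C(n, k) * p^k * (1 - p)^(n - k), computed in log space
        log_pmf = (lgamma(total_reads + 1) - lgamma(k + 1)
                   - lgamma(total_reads - k + 1)
                   + k * log_p + (total_reads - k) * log_q)
        tail += exp(log_pmf)
    return min(tail, 1.0)


def is_true_mutation(variant_reads: int, total_reads: int,
                     error_rate: float = 0.01, cutoff: float = 1e-4) -> bool:
    """Classify a variant as a true mutation when the error-only
    explanation has a p-value below the cutoff (assumed 1e-4 here)."""
    return binomial_tail_p(variant_reads, total_reads, error_rate) < cutoff
```

Under these assumptions, a variant seen in all 10 of 10 copies of a tagged molecule is classified as a true mutation, whereas a variant seen in only 1 of 10 copies is not, which matches the qualitative rule given for Fig. 5B.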

[0040] Fig. 15 illustrates the types of mutation by Population and by functional effect. The top table indicates all mutations that were considered and the distribution of mutation type percentages within Populations. The bottom table shows severely impaired mutations in Pop1, by type, and neutral mutations by type. Segments 2, 7, and 8 were excluded due to alternative reading frames within the predominant coding region.

[0041] Fig. 16 shows the fraction of conservative (BLOSUM62 > 0; light gray) vs. non-conservative (BLOSUM62 < 0; dark gray) missense mutations for each essentialness category, for segments 1-6.

[0042] Figs. 17A-17C indicate the correlation of essentialness with known structure-function. Color coding of the side chains is as shown in Fig. 12. Fig. 17A shows the nuclear localization signal of NP. Fig. 17B is the homeodomain-homology region of NS1. Fig. 17C shows the major NLS of PB2 (loop region comprising K752) in complex with importin-α. Dotted lines show major contacts. The hydrogen bonds with K152 (left panel) are disrupted by the lethal mutation to e (right panel).

[0043] Fig. 18 shows the localization of 5 neutral residues on the M2 structure. Crystal structure of the M2 homotetramer. Left, side view; right, top-down view. Side chains (L40, I42, L43, D44, and R45) are neutral and lie in the lower end of the ion channel.

[0044] Figs. 19A-19H illustrate possible explanations and predictions for PA and HA structure-function based on essentialness data. Fig. 19A shows the PA C-terminus, illustrating a view as looking into the "dragon's mouth". Dark gray denotes an inhibitor PB1 N-term peptide (1-25) that is bound inside the mouth. 3 severely impaired or strongly attenuated residues predicted to interact with PB1 are labeled. Side chains are labeled in "stick" view (PyMol). Fig. 19B shows the PA C-term, in which a cysteine on the α12 helix putatively stabilizes the α11 and α13 helices. Fig. 19C shows a different view of the PA C-term, in which two cysteines in proximity likely form a disulfide bond. Fig. 19D indicates the PA N-term, in which 5 severely impaired or strongly attenuated residues reside within the amino acid region 163-178, previously implicated in regulation of polymerase activity via promoter binding. Fig. 19E illustrates a second view of the PA N-term, in which 11 strongly attenuated residues reside within the amino acid region 146-160, which is of unknown function. Fig. 19F shows the stalk region of HA. Side chains are contact sites with antibody CR6261, colored by essentialness category. Fig. 19G depicts the globular head region of HA; portions of the beta strands show moderate attenuation (dark gray), while helices and loops/coils show neutral attenuation (light gray). Fig. 19H illustrates a coil in the HA stalk, adjacent to the stem loop, with high essentialness. Arrows denote contact sites with CR6261; the curved arrow indicates an Ab contact site behind the essential loop, which is in the foreground.

[0045] Fig. 20 indicates validation of 15 mutants lying within candidate vaccine stretches.

15 mutants were tested, TCID50 of transfected C227 supernatant. Two were >1.5 logs below wild type, two were >2.5 logs reduced, and eleven were undetectable. [0046] Fig. 2 1 illustrates conservation diagrams of seven vaccine candidate stretches and three randomly chosen stretches (con). All sequences deposited in HyperTextTransferProtocol://WorldWideWeb.ncbi.nlm.nihDOTgov/genomes/FLU/FLU.Hyper TextMarkupLanguage, wherein "HyperTextTransferProtocol" is "http", "Worldwide Web" is "www", "DOT" is and "HyperTextMarkupLanguage" is "html", were used as input. Height of letters corresponds to percent of all database sequence that were the indicated amino acid. Peptides listed in Fig. 1 are identified with the following sequence identifiers: VQNALNGNGD (Ml 80-89) (SEQ ID NO: 120); (T/L)G(N/S)QN(Q/H/N)(T/I)(E/G)(T/I)C (NA 40-49) (SEQ ID NO: 121); (S/E)SWSYIVE(T/K)(S/P) (HA 91-100) (SEQ ID NO: 122); CSCYPD(T/S/A)(G/S)(E/T/I)(I/V)(T/M)CVCRDNWHGS (NA 263-284) (SEQ ID NO: 123); LSNLAKGEKANVLIGQGDVVLVMKRK (PB2 713-739) (SEQ ID NO: 124); DT(L/I)CIGYHANNSTDTVDTVLEKNVTV (HA 18-43) (SEQ ID NO: 125); LE(D/S)VFAGKNTDLEALMEWLKT (Ml 29-48) (SEQ ID NO: 126); EGSYFFGDNAE (NP 357-372) (SEQ ID NO: 127); MVAYMLERELVRK (PB2 202-214) (SEQ ID NO: 128); and GEEMATKADY (PA 152-161) (SEQ ID NO: 129). [0047] Fig. 22A and Fig. 22B illustrate candidate peptides for vaccine formulation. Fig. 22A indicates peptides for flu vaccines. Essent. is the average score of missense changes of all residues in the peptide. Scoring rubric was: severely impairing, 4 : strongly attenuated, 3; moderately attenuated, 2, and neutral, 1. HLA allele is the supertype representative demonstrated or predicted to bind the peptide. Affinity is in nM. Classification is the demonstrated or predicted strength of binding. Fig. 22B indicates potential peptides for flu vaccines. Asterisk indicates the estimated essentialness score because these peptide includes several low coverage residues. Peptides listed in Fig. 22A and Fig. 
22B are identified with the following sequence identifiers: VLIGQGDVVLV (SEQ ID NO: 130); YMLERELV (SEQ ID NO: 131); YMLERELVRK (SEQ ID NO: 132); VVLVMKRK (SEQ ID NO: 133); AYMLERELV (SEQ ID NO: 134); GQGDVVLVM (SEQ ID NO: 135); GAVAVLKY (SEQ ID NO: 62); YHANNSTDTV (SEQ ID NO: 136); NLAKGEKANVL (SEQ ID NO: 137); GEKANVLI (SEQ ID NO: 138); LSTRGVQI (SEQ ID NO: 32); NTDLEVLM (SEQ ID NO: 139); LTDSQTATK (SEQ ID NO: 140); STDTVDTI (SEQ ID NO: 141); ILTDSQTA (SEQ ID NO: 142); VLMEWLKT (SEQ ID NO: 143); ILTDSQTATK (SEQ ID NO: 2); EMATKADY (SEQ ID NO: 21); DTVDTILEKNV (SEQ ID NO: 144); GPDDGAVAVL (SEQ ID NO: 61); VQIASNENM (SEQ ID NO: 145); LEVLMEWL (SEQ ID NO: 146); and KNTDLEVLMEW (SEQ ID NO: 147). [0048] Fig. 23A-Fig. 23C illustrate a mini-replicon assay for confirmed lethal mutants of PB2, PA, and NP. A 3+1 system was used where 3 of the wild type PB2, PB1, PA, and/or NP were co-transfected along with the indicated mutant and a vLuciferase reporter construct. RNA from cells was extracted and RT-QPCR performed with primers recognizing cDNA form vLuc. Shown are relative values, compared to a negative control lacking the relevant segment. Fig. 23A is PB2; Fig. 23B is PA; Fig. 23C is NP. * is moderately attenuated; † is neutral in the selection-sequencing. [0049] Figs. 24A-Fig. 24G show an additional function for a PB2 mutant. Fig. 24A shows the infectious titer of C227 transfected supernatant of PB2-692 vs. wild type in the limiting dilution titer assay (a-e, h-i: wildtype, black bars; PB2-692, gray bars). Dotted line, detection limit of assay. Fig. 24B is the relative levels of vRNA, cRNA, and mRNA in an artificial mini- replicon/polymerase assay. Fig. 24C shows the relative luciferase activity in the mini-replicon assay. Fig. 24D indicates the relative numbers of viral genomic copies as measured by QPCR for NP in the supernatant of transfected C227 cells. Fig. 
24E shows the relative numbers of intracellular viral genomic copies in infected A549 cells, as measured by QPCR for NP, at the indicated times after infection. Fig. 24F illustrates immunofluorescence for NP in A549 cells 12 hours after infection. Fig. 24G shows the quantification of all 8 viral segments in the supernatant of infected A549 cells 24 hours after a single round of infection, as measured by QPCR for each individual segment. In (Fig. 24E - Fig. 24G), all infections were normalized to copy number of input virus, typically with an approximate wild type virus MOI of 0.01 . Error bars, mean ± s.d. In Fig. 24G, error range was typically about 0.5 QPCR cycles, or about 1.4- fold. [0050] Fig. 25 shows immunogenicity of a predicted HLA-B7-restricted peptide. The y-axis indicates the percentage of Ag-specific CD8+ T cells of all IFN-y-secreting CD8+ T cells in the lungs of immunized mice. [0051] Fig. 26 shows the percentage of Ag-specific CD8+ T cells of all IFN-y-secreting CD8+ T cells in the lungs of mice immunized with adjuvanted influenza peptides. [0052] Fig. 27A and Fig. 27B show mutant library passaging and sequencing library preparation. Fig. 27A illustrates that the HA segment was randomized by error-prone PCR. The randomized segment with the remaining seven wild type segments were transfected into C227 cells to generate the viral mutant library. Two rounds of 24-hour infections were performed using A549 cells with an MOI of 0.05. Both the plasmid library and the passaged viral library were subjected to sequencing using the Illumina HiSeq 2000 machine. Fig. 27B illustrates that the HA gene was divided into 12 amplicons for the first PCR. Unique tags were assigned to both ends of the individual molecules during the amplification process. The second PCR generated identical copies of individual molecules linked with unique tags. Dark circles represent true mutations; light circles represent sequencing errors. [0053] Fig. 
28A illustrates single-nucleotide resolution fitness profiling. Fig. 28A illustrates computation of the RF index (relative fitness) for individual point mutations across the HA gene. Logio of the RF index is plotted on the y-axis. Each nucleotide position is represented by four consecutive lines for the RF indices that correspond to mutating to A, T, C, or G. The Logio RF index of wild type (WT) nucleotides is set as zero. Only point mutations with a coverage of > 30 tag-conflated reads in the plasmid library are shown. Otherwise, point mutations are plotted as a gray circle on the zero baseline. A short region is shown as an inset to demonstrate the resolution of the dataset. Fig. 28B illustrates the distributions of the logio RF indices for silent substitutions, nonsense substitutions and missense substitutions are displayed as histograms. Mutations located at the 5' terminal 200 bp and 3' terminal 200 bp regions are not included in this analysis to avoid confounding by the vRNA packaging signal. [0054] Fig. 29A illustrates experimental validation of a method described herein. Fig. 29A: the top panel displays the logio TCID50 value of mutant virus rescued from transfection. The bottom panel represents their logio RF indices from the biological duplicate. Fig. 29B: A Pearson correlation of 0.9 is obtained between logio TCID50 from transfection (x-axis) and logio RF index (y-axis). [0055] Fig. 30 shows a bar chart that represents the RF indices of all profiled amino acid substitutions at heptad position d. RF indices of silent mutations are also included for comparison (see also Fig. 4D of Wu et al. Scientific Reports, 2014). Fig. 3 1 exemplifies a comparison with phenotype reported in the literature. [0056] Fig. 32A and 32B illustrate sequencing coverage and depth. The sequencing depth is displayed in terms of number of "error- free" reads, and shows the number of "error- free" reads that cover a particular nucleotide or mutation. Fig. 
32A illustrates sequencing depths across the HA gene in the plasmid mutant library (DNA), replicate 1 of the passaged viral mutant library (Replicate 1), and replicate 2 of the passaged viral mutant library (Replicate 2). Fig. 32B illustrates the distribution of "error-free" read counts of individual point mutations in the plasmid mutant library.

[0057] Fig. 33 shows the distribution of conflated cluster size. Reads from the same amplicon with the same tag were defined as a cluster. The counts (number of reads) for all clusters are displayed as a histogram. Individual molecules, each carrying a unique tag, have an average copy number of 10 in the sequencing data, thus validating the sequencing library preparation design.

[0058] Fig. 34 shows comparison with BLOSUM62-based amino acid conservation. RF indices of missense mutations from different segments were extracted and compared to amino acid conservation. The degree of amino acid conservation was quantified by the BLOSUM62 matrix, a substitution matrix based on an implicit model of evolution. The x-axis represents the different cutoffs for BLOSUM62 values. The average RF index value for missense mutations that satisfied the cutoff was plotted against different BLOSUM62 cutoff values. The positive correlation between the RF index and the degree of amino acid conservation of missense mutations indicates that the fitness data are consistent with the evolutionary trend for missense mutations.

[0059] Figs. 35A-35D illustrate the RF index of substitutions at different functional sites. Fig. 35A: E339 and R416 on the NP protein form a salt bridge at the homodimer interface, which can be essential for viral replication. This suggests that the interface can be a drug target. Several small molecules have been identified to target this interface and inhibit viral replication. Fig. 35B: T271A has been identified as a replication enhancement substitution on PB2. T271A virus showed enhanced growth as compared to the WT strain in mammalian cells in vitro. Fig.
35C: NA H259Y (N1 naming: H274Y), which can be an oseltamivir drug resistance substitution, was shown to present a strongly attenuated phenotype in WSN. In contrast, H259N (N1 naming: H274N) did not impose a deleterious effect in the fitness profiling data. This substitution is hypothesized to reduce influenza zanamivir sensitivity. Fig. 35D: L26I, L26F, V27A, and S31N on M2, the amantadine/rimantadine resistance substitutions, were shown to impose little effect on viral replication. The data are consistent with the observation that resistance substitutions emerged rapidly during amantadine/rimantadine drug treatment. The dotted line represents the average RF index for missense mutations at the indicated segment. Overall, the fitness data were consistent with the phenotypes of functional mutants reported in the literature.

[0060] Fig. 36 shows experimental validation of severely attenuated and neutral mutations. Mutations with an RF index < 0.05 were classified as severely attenuated, and mutations with an RF index > 0.4 were classified as neutral. Individual mutants were constructed and compared to the wild type (WT) replication phenotype. Post-transfection titers were plotted for lethal and viable mutants. Infection was initiated at an MOI of 0.05. Virus was harvested at 24 hours post infection. Of the validated mutations with an RF index < 0.05, 68% have at least a 1 log decrease in titer compared to WT. Of the validated mutations with an RF index > 0.4, 94% have a titer within a 2-fold change as compared to WT. Overall, the validation rate is about 80%.
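
The coverage cutoff (> 30 tag-conflated reads in the plasmid library) and the RF index classes (< 0.05 severely attenuated, > 0.4 neutral) described for the figures can be sketched in code. The Python sketch below is illustrative only. The figure descriptions do not state a formula for the RF index, so the ratio used here (mutant enrichment after passaging divided by wild type enrichment) is an assumption, chosen so that the wild type has a log10 RF index of zero. The function names and the label for intermediate values are hypothetical.

```python
import math

MIN_PLASMID_READS = 30  # Fig. 28A: only mutations with > 30 tag-conflated reads are shown
SEVERE_MAX = 0.05       # Fig. 36: RF index < 0.05 is classified as severely attenuated
NEUTRAL_MIN = 0.4       # Fig. 36: RF index > 0.4 is classified as neutral


def rf_index(mut_plasmid, mut_passaged, wt_plasmid, wt_passaged):
    """Assumed RF index: mutant enrichment divided by wild type enrichment.

    Inputs are read counts. Returns None when the plasmid-library
    coverage of the mutation does not exceed the cutoff.
    """
    if mut_plasmid <= MIN_PLASMID_READS:
        return None
    mutant_enrichment = mut_passaged / mut_plasmid
    wt_enrichment = wt_passaged / wt_plasmid
    return mutant_enrichment / wt_enrichment


def log10_rf(rf):
    """log10 of the RF index, or None if the index is missing or not positive."""
    if rf is None or rf <= 0:
        return None
    return math.log10(rf)


def classify(rf):
    """Map an RF index to the classes named in the Fig. 36 description."""
    if rf is None:
        return "insufficient coverage"
    if rf < SEVERE_MAX:
        return "severely attenuated"
    if rf > NEUTRAL_MIN:
        return "neutral"
    return "intermediate"  # hypothetical label for values between the two cutoffs


if __name__ == "__main__":
    # Hypothetical read counts: (mut_plasmid, mut_passaged, wt_plasmid, wt_passaged)
    for counts in [(100, 2, 10000, 10000), (100, 50, 10000, 10000), (20, 5, 10000, 10000)]:
        rf = rf_index(*counts)
        print(counts, rf, classify(rf))
```

Under this assumed definition, a mutation whose frequency is unchanged relative to wild type has an RF index of 1, and a mutation that drops out of the passaged pool has an RF index near 0.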

[0061] DETAILED DESCRIPTION

[0062] 1. OVERVIEW

[0063] Disclosed herein are methods, compositions (e.g., vaccine formulations), kits, and reagents for developing a vaccine, e.g., a peptide-based vaccine. The peptide-based vaccine can be a peptide-based vaccine for treating or inhibiting infection by a virus, e.g., influenza virus, hepatitis virus, human immunodeficiency virus, and the like, in a subject. As used herein, the terms "individual(s)", "subject(s)", and "patient(s)" are used interchangeably and can include any animal, e.g., a mammal or bird. As used herein, a "mammal" can include any mammal, including a human. None of the terms require or are limited to situations characterized by the constant or intermittent supervision and/or care of a health care worker (e.g., a doctor, a nurse, a nurse practitioner, a physician's assistant, an orderly, a hospice worker, or the like).

[0064] Described herein are methods, compositions, and kits for generating a universal vaccine. A universal vaccine can be a vaccine that offers broad-based protection against multiple strains of a pathogen (e.g., multiple strains of influenza A virus), against multiple pathogens (e.g., influenza A virus, hepatitis C virus, and human immunodeficiency virus), and/or multiple pathogens within a family of pathogens (e.g., influenza A virus, influenza B virus, and influenza C virus; or Hepatitis A, B, C, D, E, and F viruses). In some embodiments, the methods described herein can be used to identify peptide regions or epitopes based on invariance. Invariance, as used herein, can describe the functional importance of an amino acid residue in the context of the fitness of a pathogen. Invariance can be a measurement of the fitness of a pathogen. At an amino acid residue level, invariance can be associated with how tolerant an amino acid residue is to a mutation and how adverse this mutation is to the ability of the pathogen to propagate, that is, fitness of the pathogen. Invariance can be correlated with the role of an amino acid residue in a pathogen's survival. For example, a mutation in a pathogen that exerts a deleterious effect on the proliferation of the pathogen can be considered a destructive mutation and would not be propagated within a pathogen population. An associated amino acid position correlated with the deleterious mutation can be characterized as invariant as its mutation would not be tolerated. [0065] In some embodiments, the methods described herein involve further selection of an invariant peptide based on its interaction with a human leukocyte antigen (HLA) group. HLAs, also termed as major histocompatibility complexes (MHC) in non-human vertebrates, can be cell surface antigen presenting proteins which can present foreign peptides to effector lymphocytes. 
Sometimes, HLAs can be further categorized based on the origin of the antigens that they present, and can generally be labeled according to the MHC class I, class II or class III format. HLAs that correspond to MHC class I molecules can present intracellular antigens that are produced from digested foreign proteins to killer T cells, also known as CD8+ cytotoxic T cells or cytotoxic T lymphocytes (CTLs), which can then destroy the cells. HLAs that correspond to MHC class II molecules can present extracellular antigens to CD4+ (helper) T cells. CD4+ lymphocytes can be immune response mediators that can play a role in establishing and modulating the adaptive immune response. HLAs that correspond to MHC class III can encode components from the complement system, or from the innate immune system. In some embodiments, an invariant peptide can be further selected based on its ability to interact with HLAs that correspond to MHC class I, HLAs that correspond to MHC class II, or HLAs that correspond to MHC class III.

[0066] In some cases, invariance of a sequence is correlated with sequence consensus. Sequence consensus can refer to either sequence homology or sequence identity. In some cases, invariance of a sequence is not correlated with sequence consensus. In some cases, invariance of a sequence is not correlated with sequence homology. In some cases, invariance of a sequence is not correlated with sequence identity. Sequence identity can refer to two or more sequences that have an identical residue at a position. Sequence homology can refer to two or more sequences that have a residue with similar charge at a position. For example, the amino acid residues can be divided into polar (e.g., Gly, Ser, Thr, Cys, Tyr, Asn, and Gln), nonpolar (e.g., Ala, Val, Leu, Ile, Pro, Phe, Trp, and Met), acidic (e.g., Asp and Glu), and basic (e.g., Lys, Arg, and His) groups.
Therefore, two or more sequences characterized as homologous sequences can contain residues selected from, for example, the polar amino acid group but may not have identical sequences at a position.
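
The distinction between sequence identity and sequence homology drawn above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the two sequences are compared position by position without gaps, they have equal length, and "homology" is scored as membership in the same one of the four residue groups listed above. The function names and the example peptides are hypothetical and are not sequences from this disclosure.

```python
# Residue groups as listed in the text, in one-letter code.
GROUPS = {
    "polar": "GSTCYNQ",      # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "nonpolar": "AVLIPFWM",  # Ala, Val, Leu, Ile, Pro, Phe, Trp, Met
    "acidic": "DE",          # Asp, Glu
    "basic": "KRH",          # Lys, Arg, His
}
RESIDUE_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def _check(a, b):
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")


def percent_identity(a, b):
    """Percentage of positions at which both sequences have the same residue."""
    _check(a, b)
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def percent_same_group(a, b):
    """Percentage of positions at which both residues fall in the same group."""
    _check(a, b)
    matches = sum(RESIDUE_GROUP[x] == RESIDUE_GROUP[y] for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Hypothetical four-residue peptides, for illustration only.
    print(percent_identity("SKLD", "TRID"))    # identical only at the last position
    print(percent_same_group("SKLD", "TRID"))  # every position stays within its group
```

In this example the two peptides share one identical residue out of four, yet every position is occupied by residues of the same group, which is the sense in which homologous sequences need not be identical.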

[0067] In some embodiments, the steps in generating vaccines described herein are illustrated in Fig. 1. A nucleic acid library containing mutations in multiple or every nucleic acid base position in a gene or genome of a pathogen can be generated from a pathogen source (101), e.g., a virus. This approach can allow simulation of all possible mutations that can occur in a particular pathogen strain. Sometimes the mutations are randomly introduced, e.g., by stochastic methods such as random mutagenesis. Mutations can be introduced at specific positions, e.g., by a non-stochastic method such as site-specific mutagenesis. The nucleic acids from the library can then be introduced into cells to generate a set of viruses (102). A virus can be assembled in a cell-free extract. Mutations at nucleic acid bases which may be lethal to the virus can be predicted to drop out of the pool during infection rounds. Nucleic acid bases containing mutations that can be neutral to viral propagation may not affect the virus and may be present in the final pool at a frequency close to the original frequency in the nucleic acid library. Many nucleic acid bases can lie in-between, e.g., a mutation that is moderate to viral propagation can cause relative attenuation but not absolute lethality to the virus. The nucleic acids obtained from the viruses can be sequenced, e.g., using a next-generation sequencing (NGS) method (103) (e.g., Illumina/Solexa sequencing using reversible dye terminator nucleotides), and then compared with the sequences from the nucleic acid library (104). [0068] A value can be assigned at an amino acid residue location within a protein to indicate how deleterious a mutation can be with respect to the pathogen fitness. 
Regions that can exhibit multiple invariant amino acid residues, e.g., regions in which a mutation is deleterious to the pathogen fitness, can then be selected for HLA interaction analysis and/or sequence conservation analysis among different strains of the pathogen. Additional in vitro and in vivo studies can be performed to identify one or more peptides for use in generating a vaccine preparation. Further, one or more candidate peptides can be used for vaccine formulation and development (105) and subsequent administration to a patient as part of either a therapeutic treatment regimen or as a prophylactic measure (106).

[0069] Exemplified herein are peptides from influenza A virus proteins. In some embodiments, the influenza A protein can be PB1, PB1-F2, PB2, PA, HA, NP, NA, M1, M2, NS1, or NEP/NS2. Provided herein are peptides that comprise, consist essentially of, or consist of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2), or fragments thereof of at least 8 amino acids. Also provided herein are peptides that comprise a sequence with at least 50, 70, 80, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2). Also provided herein are peptides that comprise a sequence with at least 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2), or fragments thereof of at least 8 amino acids. In some embodiments, the peptides can be less than 50, 40, 30, 20, or 10 amino acids in length. Also provided herein are compositions (e.g., vaccines) and kits comprising one or more of the peptides, e.g., about, or at least, 5, 10, 20, 30, 40, 50, or 53 unique peptides.

[0070] In some embodiments, the present invention provides nucleic acids that encode a peptide with a sequence from an influenza A protein.
In some embodiments, the nucleic acid can encode a peptide that comprises, consists essentially of, or consists of an amino acid sequence from the group consisting of SEQ ID NOs: 1-68, or a fragment thereof of at least 8 amino acids. In some embodiments, the nucleic acid can encode a peptide that comprises a sequence with at least 50, 70, 80, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2), or a fragment thereof of at least 8 amino acids. In some embodiments, the nucleic acid can encode a peptide that comprises a sequence with at least 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2), or a fragment thereof of at least 8 amino acids. In some embodiments, the encoded peptides can be less than 50, 40, 30, 20, or 10 amino acids in length. Also provided herein are compositions (e.g., vaccines) and kits comprising one or more of the nucleic acid molecules.

[0071] In some embodiments, the present invention provides proteins (e.g., antibodies or antibody fragments) that recognize or bind to a peptide from an influenza A protein. In some embodiments, the protein (e.g., an antibody or antibody fragment) can bind a sequence that comprises, consists essentially of, or consists of an amino acid sequence from the group consisting of SEQ ID NOs: 1-68, or a fragment thereof of at least 8 amino acids. In some embodiments, the protein (e.g., an antibody or antibody fragment) can bind a sequence with at least 50, 70, 80, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2). In some embodiments, the protein (e.g., an antibody or antibody fragment) can bind a sequence with at least 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-68 (see Table 1 and Table 2), or a fragment thereof of at least 8 amino acids. Also provided herein are compositions (e.g., vaccines) and kits comprising one or more of the proteins (e.g., antibodies or antibody fragments).
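
One possible reading of "sequence identity to at least 8 contiguous amino acids" can be sketched as an ungapped sliding-window comparison. The Python sketch below is illustrative only and is not a definition of the claim language: it assumes a fixed window of 8 residues, no gaps, and identity scored as the fraction of matching positions within the window. The function names are hypothetical, and the reference string is a placeholder, not any of SEQ ID NOs: 1-68.

```python
def window_identity(a, b):
    """Percent identity between two equal-length windows (ungapped)."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(candidate, reference, window=8):
    """Highest percent identity between any window of `window` contiguous
    residues in the reference and any equally long window in the candidate.

    Returns 0.0 if either sequence is shorter than the window.
    """
    if len(candidate) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_window = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            score = window_identity(candidate[j:j + window], ref_window)
            best = max(best, score)
    return best


def meets_identity(candidate, reference, threshold, window=8):
    """True if some window reaches the given percent-identity threshold."""
    return best_window_identity(candidate, reference, window) >= threshold


if __name__ == "__main__":
    reference = "ACDEFGHIKLMNPQ"  # placeholder sequence, for illustration only
    print(best_window_identity("DEFGHIKL", reference))
    print(meets_identity("DEFGAIKL", reference, threshold=80))
```

A gapped alignment or a different window length would give different values, so the thresholds recited above should not be read as tied to this particular scoring.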

[0072] 2. PATHOGEN TARGETS

[0073] In addition to the pathogen targets exemplified herein, any target, pathogen, or component thereof can be used to prepare a composition, e.g., vaccine, using one or more of the methods disclosed herein. Exemplary components include proteins comprising native sequences, peptides comprising natural or unnatural amino acids and/or with modifications such as glycosylation, palmitoylation, myristoylation, and the like, and nucleic acids comprising natural or unnatural bases. In some embodiments, a pathogen can be any , virus, or that causes infection in a mammal. In some embodiments, a pathogen can be a virus. The virus can be prepared from any target, pathogen, or component thereof.

[0074] In some embodiments, the virus can be a DNA virus or an RNA virus. The DNA virus can be a single-stranded (ss) DNA virus, a double-stranded (ds) DNA virus, or a DNA virus that contains both ss and ds DNA regions. The RNA virus can be a single-stranded (ss) RNA virus or a double-stranded (ds) RNA virus. The ssRNA virus can further be classified into a positive-sense RNA virus or a negative-sense RNA virus.

[0075] The dsDNA virus can be from the family: , , , , , , , Rudiviridae, , , , Asfaviridae, , , , Corticoviridae, , , , , , , , Nimaviridae, Pandoraviridae, , , , , , , Sphaerolipoviridae, and Tectiviridae.

[0076] The ssDNA virus can be from the family: , Bacillariodnaviridae, Bidnaviridae, , , Inoviridae, , , , and .

[0077] The DNA virus that contains both ss and ds DNA regions can be from the group of pleolipoviruses. In some cases, the pleolipoviruses include Haloarcula hispanica pleomorphic virus 1, Halogeometricum pleomorphic virus 1, Halorubrum pleomorphic virus 1, Halorubrum pleomorphic virus 2, Halorubrum pleomorphic virus 3, and Halorubrum pleomorphic virus 6.

[0078] The dsRNA virus can be from the family: , , Cystoviridae, , Hypoviridae, Megavirnaviridae, , Picobirnaviridae, , , and .
[0079] The positive-sense ssRNA virus can be from the family: , , , , Astroviridae, , , , , , , , , , , , Leviviridae, Luteoviridae, , , Narnaviridae, , , Picornaviridae, , Roniviridae, , Togaviridae, , , and . [0080] The negative-sense ssRNA virus can be from the family: , , , , , Arenaviridae, Bunyaviridae, Ophioviridae, and . [0081] Exemplary viruses that can be used in the methods described herein include, but are not limited to: Abelson leukemia virus, Abelson murine leukemia virus, Abelson's virus, Acute laryngotracheobronchitis virus, Adelaide River virus, Adeno associated virus group, Adenovirus, African horse sickness virus, African swine fever virus, AIDS virus, Aleutian mink disease parvovirus, Alpharetrovirus, , ALV related virus, Amapari virus, , , Arbovirus, Arbovirus C, arbovirus group A, arbovirus group B, group, Argentine hemorrhagic fever virus, Argentine hemorrhagic fever virus, Arterivirus, , Ateline herpesvirus group, Aujezky's disease virus, Aura virus, Ausduk disease virus, Australian bat , Aviadenovirus, avian erythroblastosis virus, avian infectious bronchitis virus, avian leukemia virus, avian leukosis virus, avian lymphomatosis virus, avian myeloblastosis virus, avian paramyxovirus, avian pneumoencephalitis virus, avian reticuloendotheliosis virus, avian sarcoma virus, avian type C group, , , B virus, B19 virus, Babanki virus, baboon herpesvirus, baculovirus, Barmah Forest virus, Bebaru virus, Berrimah virus, Betaretrovirus, Birnavirus, Bittner virus, BK virus, Black Creek Canal virus, bluetongue virus, Bolivian hemorrhagic fever virus, Boma disease virus, border disease of sheep virus, borna virus, bovine alphaherpesvirus 1, bovine alphaherpesvirus 2, bovine coronavirus, bovine ephemeral fever virus, bovine immunodeficiency virus, bovine leukemia virus, bovine leukosis virus, bovine mammillitis virus, bovine papillomavirus, bovine papular stomatitis virus, bovine parvovirus, bovine syncytial virus, bovine type C oncovirus, bovine viral 
diarrhea virus, Buggy Creek virus, bullet shaped virus group, Bunyamwera virus supergroup, Bunyavirus, Burkitt's lymphoma virus, Bwamba Fever, CA virus, Calicivirus, California encephalitis virus, camelpox virus, canarypox virus, canid herpesvirus, canine coronavirus, canine distemper virus, canine herpesvirus, canine minute virus, canine parvovirus, Cano Delgadito virus, caprine arthritis virus, caprine encephalitis virus, Caprine Herpes Virus, Capripox virus, , caviid herpesvirus 1, Cercopithecid herpesvirus 1, cercopithecine herpesvirus 1, Cercopithecine herpesvirus 2, Chandipura virus, Changuinola virus, channel catfish virus, Charleville virus, chickenpox virus, Chikungunya virus, chimpanzee herpesvirus, chub reovirus, chum salmon virus, Cocal virus, Coho salmon reovirus, coital exanthema virus, Colorado tick fever virus, , Columbia SK virus, common cold virus, contagious eethyma virus, contagious pustular dermatitis virus, Coronavirus, Corriparta virus, coryza virus, cowpox virus, coxsackie virus, CPV (cytoplasmic polyhedrosis virus), cricket paralysis virus, Crimean-Congo hemorrhagic fever virus, croup associated virus, Cryptovirus, , , cytomegalovirus group, cytoplasmic polyhedrosis virus, deer papillomavirus, deltaretrovirus, dengue virus, Densovirus, Dependovirus, Dhori virus, diploma virus, Drosophila C virus, duck hepatitis B virus, duck hepatitis virus 1, duck hepatitis virus 2, duovirus, Duvenhage virus, Deformed wing virus DWV, eastern equine encephalitis virus, eastern equine encephalomyelitis virus, EB virus, Ebola virus, Ebola-like virus, echo virus, echovirus, echovirus 10, echovirus 28, echovirus 9, ectromelia virus, EEE virus, EIA virus, EIA virus, encephalitis virus, encephalomyocarditis group virus, encephalomyocarditis virus, , enzyme elevating virus, enzyme elevating virus (LDH), epidemic hemorrhagic fever virus, epizootic hemorrhagic disease virus, Epstein-Barr virus, equid alphaherpesvirus 1, equid alphaherpesvirus 4, equid herpesvirus 2, 
equine abortion virus, equine arteritis virus, equine encephalosis virus, equine infectious anemia virus, equine morbillivirus, equine rhinopneumonitis virus, equine , Eubenangu virus, European elk papillomavirus, European swine fever virus, Everglades virus,

Eyach virus, felid herpesvirus 1, feline calicivirus, feline fibrosarcoma virus, feline herpesvirus, feline immunodeficiency virus, feline infectious peritonitis virus, feline leukemia/sarcoma virus, feline leukemia virus, feline panleukopenia virus, feline parvovirus, feline sarcoma virus, feline syncytial virus, Filovirus, Flanders virus, , foot and mouth disease virus, Fort Morgan virus, Four Corners hantavirus, fowl adenovirus 1, fowlpox virus, Friend virus, Gammaretrovirus, GB hepatitis virus, GB virus, German measles virus, Getah virus, gibbon ape leukemia virus, glandular fever virus, goatpox virus, golden shinner virus, Gonometa virus, goose parvovirus, granulosis virus, Gross' virus, ground squirrel hepatitis B virus, group A arbovirus, Guanarito virus, guinea pig cytomegalovirus, guinea pig type C virus, Hantaan virus, Hantavirus, hard clam reovirus, hare fibroma virus, HCMV (human cytomegalovirus), hemadsorption virus 2, hemagglutinating virus of Japan, hemorrhagic fever virus, hendra virus, Henipaviruses, Hepadnavirus, hepatitis A virus, hepatitis B virus group, hepatitis C virus, hepatitis D virus, hepatitis delta virus, hepatitis E virus, hepatitis F virus, hepatitis G virus, hepatitis nonA nonB virus, hepatitis virus, hepatitis virus (nonhuman), hepatoencephalomyelitis reovirus 3, Hepatovirus, heron hepatitis B virus, herpes B virus, , herpes simplex virus 1, herpes simplex virus 2, herpesvirus, herpesvirus 7, Herpesvirus ateles, Herpesvirus hominis, Herpesvirus infection, Herpesvirus saimiri, Herpesvirus suis, Herpesvirus varicellae, Highlands J virus, Hirame rhabdovirus, hog cholera virus, human adenovirus 2, human alphaherpesvirus 1, human alphaherpesvirus 2, human alphaherpesvirus 3, human B lymphotropic virus, human betaherpesvirus 5, human coronavirus, human cytomegalovirus group, human foamy virus, human gammaherpesvirus 4, human gammaherpesvirus 6, human hepatitis A virus, human herpesvirus 1 group, human herpesvirus 2 group, human 
herpesvirus 3 group, human herpesvirus 4 group, human herpesvirus 6, human herpesvirus 8, human immunodeficiency virus, human immunodeficiency virus 1, human immunodeficiency virus 2, human papillomavirus, human T cell leukemia virus, human T cell leukemia virus I, human T cell leukemia virus II, human T cell leukemia virus III, human T cell lymphoma virus I, human

T cell lymphoma virus II, human T cell lymphotropic virus type 1, human T cell lymphotropic virus type 2, human T lymphotropic virus I, human T lymphotropic virus II, human T lymphotropic virus III, , infantile gastroenteritis virus, infectious bovine rhinotracheitis virus, infectious haematopoietic necrosis virus, infectious pancreatic necrosis virus, influenza virus A, influenza virus B, influenza virus C, influenza virus D, influenza virus pr8, insect iridescent virus, insect virus, , Japanese B virus, Japanese encephalitis virus, JC virus, Junin virus, Kaposi's sarcoma-associated herpesvirus, Kemerovo virus, Kilham's rat virus, Klamath virus, Kolongo virus, Korean hemorrhagic fever virus, kumba virus, Kysanur forest disease virus, Kyzylagach virus, La Crosse virus, lactic dehydrogenase elevating virus, lactic dehydrogenase virus, Lagos bat virus, Langur virus, lapine parvovirus, Lassa fever virus, Lassa virus, latent rat virus, LCM virus, Leaky virus, Lentivirus, , leukemia virus, leukovirus, lumpy skin disease virus, lymphadenopathy associated virus, , lymphocytic choriomeningitis virus, lymphoproliferative virus group, Machupo virus, mad itch virus, mammalian type B oncovirus group, mammalian type B , mammalian type C retrovirus group, mammalian type D retroviruses, mammary tumor virus, Mapuera virus, Marburg virus, Marburg-like virus, Mason Pfizer monkey virus, , Mayaro virus, ME virus, measles virus, Menangle virus, Mengo virus, Mengovirus, Middelburg virus, milkers nodule virus, mink enteritis virus, minute virus of mice, MLV related virus, MM virus, Mokola virus, Molluscipoxvirus, Molluscum contagiosum virus, monkey B virus, monkeypox virus, , , Mount Elgon bat virus, mouse cytomegalovirus, mouse encephalomyelitis virus, mouse hepatitis virus, mouse K virus, mouse leukemia virus, mouse mammary tumor virus, mouse minute virus, mouse pneumonia virus, mouse poliomyelitis virus, mouse polyomavirus, mouse sarcoma virus, mousepox virus, Mozambique virus, Mucambo

virus, mucosal disease virus, mumps virus, murid betaherpesvirus 1, murid cytomegalovirus 2, murine cytomegalovirus group, murine encephalomyelitis virus, murine hepatitis virus, murine leukemia virus, murine nodule inducing virus, murine polyomavirus, murine sarcoma virus, , Murray Valley encephalitis virus, myxoma virus, Myxovirus, Myxovirus multiforme, Myxovirus parotitidis, Nairobi sheep disease virus, Nairovirus, Nanirnavirus, Nariva virus, Ndumo virus, Neethling virus, Nelson Bay virus, neurotropic virus, New World Arenavirus, newborn pneumonitis virus, Newcastle disease virus, Nipah virus, noncytopathogenic virus, Norwalk virus, nuclear polyhedrosis virus (NPV), nipple neck virus, O'nyong'nyong virus, Ockelbo virus, oncogenic virus, oncogenic viruslike particle, oncornavirus, , Orf virus, Oropouche virus, , Orthomyxovirus, , , Orungo, ovine papillomavirus, ovine catarrhal fever virus, owl monkey herpesvirus, Palyam virus, Papillomavirus, Papillomavirus sylvilagi, Papovavirus,

parainfluenza virus, parainfluenza virus type 1, parainfluenza virus type 2, parainfluenza virus type 3, parainfluenza virus type 4, Paramyxovirus, , paravaccinia virus, Parvovirus, , parvovirus group, Pestivirus, Phlebovirus, phocine distemper virus, Picodnavirus, , pig cytomegalovirus, pigeonpox virus, Piry virus, Pixuna virus, pneumonia virus of mice, Pneumovirus, poliomyelitis virus, poliovirus, , polyhedral virus, polyoma virus, Polyomavirus, Polyomavirus bovis, Polyomavirus cercopitheci, Polyomavirus hominis 2, Polyomavirus maccacae 1, Polyomavirus muris 1, Polyomavirus muris 2, Polyomavirus papionis 1, Polyomavirus papionis 2, Polyomavirus sylvilagi, Pongine herpesvirus 1, porcine epidemic diarrhea virus, porcine hemagglutinating encephalomyelitis virus, porcine parvovirus, porcine transmissible gastroenteritis virus, porcine type C virus, pox virus, poxvirus, poxvirus variolae, Prospect Hill virus, Provirus, pseudocowpox virus, pseudorabies virus, psittacinepox virus, quailpox virus, rabbit fibroma virus, rabbit kidney vacuolating virus, rabbit papillomavirus, rabies virus, raccoon parvovirus, raccoonpox virus, Ranikhet virus, rat cytomegalovirus, rat parvovirus, rat virus, Rauscher's virus, recombinant vaccinia virus, recombinant virus, reovirus, reovirus 1, reovirus 2, reovirus 3, reptilian type C virus, respiratory infection virus, respiratory syncytial virus, respiratory virus, reticuloendotheliosis virus, Rhabdovirus, Rhabdovirus carpia, , Rhinovirus, , Rift Valley fever virus, Riley's virus, rinderpest virus, RNA tumor virus, Ross River virus, Rotavirus, rougeole virus, Rous sarcoma virus, , rubeola virus, Rubivirus, Russian autumn encephalitis virus, SA 11 simian virus, SA2 virus, Sabia virus,

Sagiyama virus, Saimirine herpesvirus 1, salivary gland virus, sandfly fever virus group, Sandjimba virus, SARS virus, SDAV (sialodacryoadenitis virus), sealpox virus, Semliki Forest Virus, Seoul virus, sheeppox virus, Shope fibroma virus, Shope papilloma virus, simian foamy virus, simian hepatitis A virus, simian human immunodeficiency virus, simian immunodeficiency virus, simian parainfluenza virus, simian T cell lymphotrophic virus, simian virus, simian virus 40, , Sin Nombre virus, Sindbis virus, smallpox virus, South American hemorrhagic fever viruses, sparrowpox virus, Spumavirus, squirrel fibroma virus, squirrel monkey retrovirus, SSV 1 virus group, STLV (simian T lymphotropic virus) type I, STLV (simian T lymphotropic virus) type II, STLV (simian T lymphotropic virus) type III, stomatitis papulosa virus, submaxillary virus, suid alphaherpesvirus 1, suid herpesvirus 2, , swamp fever virus, swinepox virus, Swiss mouse leukemia virus, TAC virus, Tacaribe complex virus, Tacaribe virus, Tanapox virus, Taterapox virus, Tench reovirus, Theiler's encephalomyelitis virus, Theiler's virus, Thogoto virus, Thottapalayam virus, Tick borne encephalitis virus, Tioman virus, Togavirus, , tumor virus, Tupaia virus, turkey rhinotracheitis virus, turkeypox virus, type C retroviruses, type D oncovirus, type D retrovirus group, ulcerative disease rhabdovirus, Una virus, Uukuniemi virus group, vaccinia virus, vacuolating virus, varicella zoster virus, , Varicola virus, variola major virus, variola virus, Vasin Gishu disease virus, VEE virus, Venezuelan equine encephalitis virus, Venezuelan equine encephalomyelitis virus, Venezuelan hemorrhagic fever virus, vesicular stomatitis virus, Vesiculovirus, Vilyuisk virus, viper retrovirus, viral haemorrhagic septicemia virus, Visna Maedi virus, Visna virus, volepox virus, VSV (vesicular stomatitis virus), Wallal virus, Warrego virus, wart virus, WEE virus, West Nile virus, western equine encephalitis virus, western equine 
encephalomyelitis virus, Whataroa virus, Winter Vomiting Virus, woodchuck hepatitis B virus, woolly monkey sarcoma virus, wound tumor virus, WRSV virus, Yaba monkey tumor virus, Yaba virus, , yellow fever virus, and the Yug Bogdanovac virus. [0082] A composition, e.g., vaccine, generated using one or more of the methods disclosed herein can be prepared from a virus, a bacterium, a fungus, or a component thereof. The composition, e.g., vaccine, generated using one or more of the methods disclosed herein can be prepared from a virus, such as a DNA virus, an RNA virus, or a component thereof. In some embodiments, the composition, e.g., vaccine, generated using one or more of the methods disclosed herein can be prepared from a DNA virus. In some embodiments, the DNA virus can be a hepatitis virus. In some embodiments, the composition, e.g., vaccine, generated using one or more of the methods disclosed herein can be prepared from an RNA virus. In some embodiments, the RNA virus can be an influenza virus, human immunodeficiency virus, or a hepatitis virus. [0083] In some embodiments, a composition, e.g., vaccine, generated using one or more of the methods disclosed herein can offer broad-based protection across multiple strains of a virus, a bacterium, or a fungus. In some embodiments, a composition, e.g., vaccine, generated using one or more of the methods disclosed herein can offer broad-based protection across multiple strains of a virus, such as a DNA virus or an RNA virus. In some embodiments, a composition, e.g., vaccine, generated using one or more of the methods disclosed herein can offer broad-based protection across multiple strains of a DNA virus. In some embodiments, the DNA virus can be a hepatitis virus. In some embodiments, a composition, e.g., vaccine, generated using one or more of the methods disclosed herein can offer broad-based protection across multiple strains of an RNA virus. 
In some embodiments, the RNA virus can be an influenza virus, human immunodeficiency virus, or a hepatitis virus.

[0084] 2a. Influenza

[0085] In some embodiments, the methods and compositions described herein can target an influenza virus. As used herein, the term "target" can refer to any interaction with a virus or organism that can achieve attenuation of the virus or organism to decrease or remove the virulence of the virus, or to achieve inactivation and/or clearance of the virus or organism. For example, a method that targets a virus can include administering to a subject a peptide that elicits an immune response in the subject against the virus, and a composition that targets the virus includes compositions that elicit a subject's immune response against the virus when administered to the subject.

[0086] An influenza virus can be an RNA virus belonging to the family Orthomyxoviridae. Influenza viruses can include influenza A virus, influenza B virus, and influenza C virus. Influenza A virus can infect humans, birds, pigs, horses, seals, and other animals. Birds can be a natural host. Influenza B and C types can infect humans or other animals. Influenza A virus can cause influenza, such as seasonal influenza or pandemic influenza. Influenza B virus and/or influenza C virus can cause influenza, such as seasonal influenza or pandemic influenza.

[0087] In some embodiments, the methods and compositions described herein can target an influenza virus evolved through antigenic drift and/or antigenic shift. Antigenic shift can refer to the process when two or more different strains of a virus or two or more strains of different viruses combine to form a new subtype having a mixture of gene segments of the parent viruses. Antigenic drift, or genetic drift, can refer to mutation over time. Influenza A virus can evolve through both antigenic shift and antigenic drift. Influenza B virus and influenza C virus can evolve through antigenic drift.

[0088] In some embodiments, the methods and compositions described herein can target an influenza virus subtype.
Influenza A virus can be subtyped based on hemagglutinin (HA) and neuraminidase (N), two proteins expressed on the surface of the viral envelope. Hemagglutinin is a protein that can induce red blood cells to agglutinate. Neuraminidase is an enzyme that can cleave glycosidic bonds of the monosaccharide. Influenza A virus can display about 18 HA subtypes: HI, H2, H3, H4, H5, H6, H7, H8, H9, H10, Hl l , H12, H13, H14, H15, H16, H17, and H18; and about eleven subtypes: Nl, N2, N3, N4, N5, N6, N7, N8, N9, N10, andNl l . Together, the HA and N subtypes can be combined in any combination. Non-limiting examples of the HA and N subtype combinations that have been observed include: H1N1, H1N2, H1N7, H2N2, H3N2, H3N8, H4N8, H5N1, H5N2, H5N8, H5N9, H6N5, H7N1, H7N2, H7N3, H7N4, H7N7, H7N9, H8N4, H9N2, H10N7, HI 1N6, H12N5, H13N6, and H14N5. In some embodiments, the vaccines described herein can target an influenza A virus that has a combination of the HA and N subtypes disclosed herein. In some cases, the combination can be represented by HxNy, wherein x represents any HI-HI 8 subtypes, and y represents any Nl-Nl 1 subtypes. For example, in some embodiments, vaccines disclosed herein can target a subtype represented as HINy, which is HI in combination with any N subtype described herein, or a subtype represented as H2Ny, and the like. In some embodiment s, a vaccine described herein can target an influenza A virus that has the HA and N subtype combinations H1N1, H1N2, H1N7, H2N2, H3N2, H3N8, H4N8, H5N1, H5N2, H5N8, H5N9, H6N5, H7N1, H7N2, H7N3, H7N4, H7N7, H7N9, H8N4, H9N2, H10N7, HI 1N6, H12N5, H13N6, or H14N5. [0089] In some embodiments, a composition, e.g., vaccine, disclosed herein can target an influenza A virus strain, such as a strain that is characterized as a seasonal influenza strain. 
A seasonal influenza strain can have characteristics that induce low pathogenicity toward mammals, e.g., humans, or it can have characteristics that induce high pathogenicity toward mammals, e.g., humans. For example, the H5Ny subtypes can be highly pathogenic and have been observed to circulate within the human population since 2003. [0090] In some embodiments, a composition, e.g., a vaccine disclosed herein can target a virus in one or more clades within each of the HA subtypes. For example, within the H5 subtype, the strains can be categorized as: clade 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. The HA sequences of H5N1 viruses circulating in avian species since 2003 can be separate into 2 distinct phylogenetic clades. Clade 1 viruses can circulate in Cambodia, Thailand, and Vietnam and can cause human infections. Clade 2 viruses can circulate in birds in China and Indonesia. They can be found in the Middle East, Europe, and Africa. Multiple subclades of clade 2 can distinguished; e.g., subclades 1, 2, and 3 can differ in geographical distribution and can be responsible for human cases. Within clade 2.3 there can be four divisions including 2.3.2 and 2.3.4. In some embodiments, a composition, e.g., a vaccine, described herein can target viruses from one or more clades within an HA subtype. In some embodiments, a composition, e.g., vaccine, described herein can target viruses from clade 0, 1, 2, 3, 4, 5, 6, 7, 8, and/or 9 within the H5 subtype. In some embodiments, a composition, e.g., vaccine, described herein can target one or more viruses from subclades of a clade 0, 1, 2, 3, 4, 5, 6, 7, 8, and/or 9 within the H5 subtype.
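The HxNy notation for influenza A subtypes (x from the H1-H18 subtypes, y from the N1-N11 subtypes) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function name `expand_subtype` and the lowercase `x`/`y` wildcard convention in the input string are hypothetical choices for this example, and the enumeration covers notational combinations, not combinations that have necessarily been observed in nature.

```python
import re
from itertools import product

# Subtype ranges as stated in the text: H1-H18 and N1-N11.
HA_SUBTYPES = [f"H{i}" for i in range(1, 19)]
N_SUBTYPES = [f"N{j}" for j in range(1, 12)]

# Hypothetical pattern syntax: a number, or the wildcard 'x' (HA) / 'y' (N).
_PATTERN = re.compile(r"H(\d+|x)N(\d+|y)")


def expand_subtype(pattern):
    """Expand a pattern such as 'H1Ny', 'HxN2', 'HxNy', or 'H5N1'
    into the list of concrete subtype names it denotes."""
    match = _PATTERN.fullmatch(pattern)
    if match is None:
        raise ValueError(f"not an HxNy-style subtype: {pattern!r}")
    ha_part, n_part = match.groups()
    ha_choices = HA_SUBTYPES if ha_part == "x" else [f"H{int(ha_part)}"]
    n_choices = N_SUBTYPES if n_part == "y" else [f"N{int(n_part)}"]
    # Reject numbers outside the ranges listed in the text.
    if ha_choices[0] not in HA_SUBTYPES or n_choices[0] not in N_SUBTYPES:
        raise ValueError(f"subtype out of range: {pattern!r}")
    return [ha + n for ha, n in product(ha_choices, n_choices)]


if __name__ == "__main__":
    print(len(expand_subtype("HxNy")))  # 18 * 11 notational combinations
    print(expand_subtype("H1Ny"))
```

Under this reading, H1Ny expands to H1 paired with each of the eleven N subtypes, and a fully specified subtype such as H5N1 expands to itself.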

[0091] In some embodiments, a composition, e.g., vaccine, described herein can target a virus from Clade 1, Clade 2.1, Clade 2.2, and/or Clade 2.3. Clade 1 viruses can cause outbreaks in birds in Thailand and Vietnam and human infections in Thailand. Clade 2.1 viruses can circulate in poultry and cause human infections in Indonesia. Clade 2.2 viruses can cause outbreaks in birds in some countries in Africa, Asia, and Europe, and can be associated with human infections in Egypt, Iraq and Nigeria. Clade 2.3 viruses can be isolated in Asia and can be responsible for human infections in China and the Lao People's Democratic Republic.

[0092] In some cases, a composition, e.g., vaccine, described herein can target a virus from an emerging clade. An emerging clade can include A/goose/Guiyang/337/2006 (clade 4) and A/chicken/Shanxi/2/2006 (clade 7). These clades can infect poultry in Asia. In some embodiments, a composition, e.g., vaccine, described herein can target a virus represented by A/goose/Guiyang/337/2006 (clade 4) and/or A/chicken/Shanxi/2/2006 (clade 7).

[0093] Representative strains from each clade include: clade 1: A/HongKong/213/03; clade 2: A/Indonesia/5/05; clade 3: A/Chicken/Hong Kong/SF219/01; clade 4: A/chicken/Guiyang/441/2006; clade 5: A/duck/Guangxi/1681/2004; clade 6: A/tree sparrow/Henan/4/2004; clade 7: A/chicken/Shanxi/2/2006; clade 8: A/Chicken/Henan/12/2004; clade 9: A/duck/Guangxi/2775/2005; and clade 0: A/Hong Kong/156/97. In some embodiments, a vaccine described herein can target viruses represented by clade 1: A/HongKong/213/03; clade 2: A/Indonesia/5/05; clade 3: A/Chicken/Hong Kong/SF219/01; clade 4: A/chicken/Guiyang/441/2006; clade 5: A/duck/Guangxi/1681/2004; clade 6: A/tree sparrow/Henan/4/2004; clade 7: A/chicken/Shanxi/2/2006; clade 8: A/Chicken/Henan/12/2004; clade 9: A/duck/Guangxi/2775/2005; and/or clade 0: A/Hong Kong/156/97.

[0094] In some embodiments, a composition, e.g., vaccine, described herein can target a strain from each of the above-disclosed clades, or a strain that has about 40%, 50%, 60%, 70%, 80%, 90%, 95%, or more sequence homology to each of the representative strains from each clade. In some embodiments, a composition, e.g., vaccine, described herein can target a strain from each of the above-disclosed clades, or a strain that has about 40%, 50%, 60%, 70%, 80%, 90%, 95%, or less sequence homology to each of the representative strains from each clade.

[0095] In some cases, a composition, e.g., vaccine, disclosed herein can target an influenza A virus strain that is characterized as a pandemic influenza strain. Pandemic influenza strains can be characterized as containing a new hemagglutinin compared to the hemagglutinins that have been circulating in the human population, or a hemagglutinin that has not been evident in the human population for over a decade, so that the general population is immunologically naive to the strain's hemagglutinin. Pandemic influenza strains have been associated with H1Ny, H2Ny, and H3Ny, such as the 1918 Spanish Flu that exhibited the H1N1 subtype, the Asian Flu from 1957-1958 which exhibited the H2N2 subtype, the Hong Kong Flu from 1968-1969 that has the H3N2 subtype, the Russian Flu from 1977-1978 which has the H1N1 subtype, and the 2009 Flu containing the H1N1 subtype. Circulations of the H5, H7, and H9 subtypes in wild birds can also contribute to generating a pandemic strain. In some cases, a pandemic influenza strain can also be associated with H5Ny, H7Ny, and H9Ny.

[0096] In some embodiments, a composition, e.g., vaccine, described herein can target an influenza B virus. Influenza B viruses can be classified into lineages and strains. An influenza B virus can belong to either the B/Yamagata or the B/Victoria lineage. Exemplary influenza B virus strains include Brisbane/60/2008, Massachusetts/2/2012, and Wisconsin/1/2010.

[0097] In some embodiments, a composition, e.g., vaccine, described herein can target an influenza B virus strain derived from the B/Yamagata and/or the B/Victoria lineage. In some embodiments, a composition, e.g., vaccine, described herein can target a strain including Brisbane/60/2008, Massachusetts/2/2012, and Wisconsin/1/2010. In some embodiments, a composition, e.g., vaccine, described herein can target strains from both the influenza A virus and influenza B virus.
[0098] In some embodiments, a composition, e.g., vaccine, described herein can target an influenza A virus, influenza B virus, and/or an influenza C virus. In some embodiments, a composition, e.g., vaccine, described herein can target strains of influenza A virus, influenza B virus, influenza C virus, or a combination thereof.

[0099] In some embodiments, a composition, e.g., vaccine, described herein can be used to treat a patient who has an influenza infection, such as an influenza A virus infection, an influenza B virus infection, or an influenza C virus infection. Sometimes, a composition, e.g., vaccine, described herein can be used as a vaccination method against the infection of influenza A virus, influenza B virus, or influenza C virus. Sometimes, a composition, e.g., vaccine, described herein offers cross-protection against the different strains associated with the influenza A virus, the influenza B virus, and/or the influenza C virus.

[0100] In some embodiments, a composition, e.g., vaccine, described herein can comprise a peptide that is identified as an invariant region in an influenza protein. In some embodiments, the influenza virus may be an influenza A virus or an influenza B virus. In some embodiments, the influenza A virus can comprise about 10 genes on 8 separate RNA molecules which can encode about 11 proteins: PB1, PB1-F2, PB2, PA, HA, NP, NA, M1, M2, NS1, and NEP/NS2. PB1, e.g., GenBank Accession No. CY034138.1 (Influenza A/WSN/1933(H1N1)), can be an RNA-dependent RNA polymerase. PB1-F2 can bind mitochondria, leading to a release of cytochrome c and induction of apoptosis in CD8 T-cells and alveolar macrophages. PB2, e.g., GenBank Accession No. CY034139.1 (Influenza A/WSN/1933(H1N1)), can be part of the RNA-dependent RNA polymerase complex. PA, e.g., GenBank Accession No. CY034137.1 (Influenza A/WSN/1933(H1N1)), can be involved in viral transcription and replication. HA, e.g., GenBank Accession No. CY034132.1 (Influenza A/WSN/1933(H1N1)), can be a protein on a surface of an influenza virus and can play a role in fusion of the viral envelope. NP, e.g., GenBank Accession No. CY034135.1 (Influenza A/WSN/1933(H1N1)), can coat the viral RNA to form the viral ribonucleoprotein complex. NA, e.g., GenBank Accession No. CY034134.1 (Influenza A/WSN/1933(H1N1)), can be on the surface of a virus and enable a virus to be released from a host cell. M1 can form an intermediate core of the virion and tether NP to the viral ribonucleoprotein complex. M2 can have ion channel activity. M1 and M2 can be found at GenBank Accession No. CY034133.1 (Influenza A/WSN/1933(H1N1)). NS1 can inhibit the cellular antiviral Type 1 interferon response. NEP/NS2 can be involved in nuclear export of the viral ribonucleoprotein complex. Sequences for NS1 and NEP/NS2 can be found at GenBank Accession No. CY034136.1 (Influenza A/WSN/1933(H1N1)). In some embodiments, one or more peptides in a composition, e.g., vaccine, described herein can be derived from PB1, PB1-F2, PB2, PA, HA, NP, NA, M1, M2, NS1, and/or NEP/NS2. Sometimes, one or more peptides used in a composition, e.g., vaccine, described herein can be derived from PB1, PB2, PA, HA, NP, NA, and/or Segment 7 (which can be an RNA segment that encodes M1 and M2). Sometimes, one or more peptides with sequences derived from PB1, PB2, PA, HA, NP, NA, and/or Segment 7 can comprise, consist essentially of, or consist of a sequence selected from SEQ ID NOs: 1-53 (Table 1). Sometimes, one or more peptides can comprise, consist essentially of, or consist of a sequence selected from SEQ ID NOs: 54-68 (Table 2). Sometimes, one or more peptides comprise, consist essentially of, or consist of a sequence selected from SEQ ID NOs: 1-68.
[0101] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence selected from the group consisting of SEQ ID NOs: 1-53, or a fragment thereof of at least 8 contiguous amino acids.

[0102] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence selected from the group consisting of SEQ ID NOs: 1-68, or a portion thereof of at least 8 contiguous amino acids.

[0103] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-53, or a portion thereof of at least 8 contiguous amino acids.

[0104] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-68, or a portion thereof of at least 8 contiguous amino acids.

[0105] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-53, or a portion thereof of at least 8 contiguous amino acids.

[0106] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-68, or a portion thereof of at least 8 contiguous amino acids.

[0107] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-53, or a portion thereof of at least 8 contiguous amino acids.

[0108] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-68, or a portion thereof of at least 8 contiguous amino acids.

[0109] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-53, or a portion thereof of at least 8 contiguous amino acids.

[0110] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-68, or a portion thereof of at least 8 contiguous amino acids.
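The percent sequence identity thresholds recited above can be made concrete with a small sketch. The text does not specify how identity is computed, so the following assumes one simple reading: an ungapped, position-by-position comparison of equal-length windows, taking the best-scoring pair of 8-residue windows from a candidate peptide and a reference sequence. The function names and the example sequences are hypothetical and are not taken from SEQ ID NOs: 1-68.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def windows(seq, size):
    """All contiguous windows of the given size (empty list if seq is shorter)."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def best_window_identity(candidate, reference, window=8):
    """Highest percent identity between any window of the candidate and
    any window of the reference (assumed interpretation, see lead-in)."""
    cand = windows(candidate, window)
    ref = windows(reference, window)
    if not cand or not ref:
        return 0.0
    return max(percent_identity(c, r) for c in cand for r in ref)


if __name__ == "__main__":
    reference = "ACDEFGHIKLMN"   # hypothetical reference sequence
    candidate = "QQDEFGHIKLQQ"   # hypothetical peptide sharing 8 residues
    score = best_window_identity(candidate, reference)
    print(score, score >= 90.0)  # compare against an "at least 90%" threshold
```

An alignment-based tool that permits gaps could give different values; this sketch only shows how an "at least" or "at most" threshold could be checked once an identity measure has been chosen.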

[0111] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence having a length of at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein the one or more peptides comprise at least 50%, 60%, 70%, 80%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-53.

[0112] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence having a length of at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-68, wherein the one or more peptides comprise at least 50%, 60%, 70%, 80%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-68.

[0113] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence having a length of at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein the one or more peptides comprise at least 50%, 60%, 70%, 80%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-53.

[0114] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence having a length of at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-68, wherein the one or more peptides comprise at least 50%, 60%, 70%, 80%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-68.
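The relative-length limits in the preceding paragraphs (a peptide whose length is at most, or at least, a stated percentage of a reference sequence's length) reduce to simple arithmetic. The sketch below is an illustration only; the function names, the `bound` argument values, and the example sequences are hypothetical and do not correspond to any sequence in the sequence listing.

```python
def length_percent(peptide, reference):
    """Length of the peptide as a percentage of the reference length."""
    if not reference:
        raise ValueError("reference sequence must be non-empty")
    return 100.0 * len(peptide) / len(reference)


def satisfies_length_bound(peptide, reference, threshold_percent, bound):
    """Check an 'at most' or 'at least' relative-length limit."""
    value = length_percent(peptide, reference)
    if bound == "at most":
        return value <= threshold_percent
    if bound == "at least":
        return value >= threshold_percent
    raise ValueError("bound must be 'at most' or 'at least'")


if __name__ == "__main__":
    reference = "ACDEFGHIKLMNPQRSTVWY"  # hypothetical 20-residue reference
    peptide = "ACDEFGHIKL"              # hypothetical 10-residue peptide
    print(length_percent(peptide, reference))
    print(satisfies_length_bound(peptide, reference, 50, "at most"))
    print(satisfies_length_bound(peptide, reference, 60, "at least"))
```

A length check of this kind would be applied together with, not instead of, the sequence identity condition recited in the same paragraphs.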

[0115] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides with at most 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-53.

[0116] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides with at most 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-68.

[0117] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides with at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 contiguous amino acids of a sequence from the group consisting of SEQ ID NOs: 1-53.

[0118] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides with at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 contiguous amino acids of a sequence from the group consisting of SEQ ID NOs: 1-68.
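One way to read the "contiguous amino acids of a sequence" conditions above is that a peptide shares an unbroken run of residues with a reference sequence. The sketch below computes the longest such shared run with a standard longest-common-substring dynamic program and compares it to a minimum length. The interpretation, the function names, and the example sequences are assumptions made for illustration; none of them is taken from SEQ ID NOs: 1-68.

```python
def longest_shared_run(peptide, reference):
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    # prev[j] holds the length of the shared run ending at the previous
    # peptide residue and reference position j.
    prev = [0] * (len(reference) + 1)
    for p in peptide:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if p == r:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def has_contiguous_fragment(peptide, reference, min_len):
    """True if the peptide shares a run of at least min_len residues."""
    return longest_shared_run(peptide, reference) >= min_len


if __name__ == "__main__":
    reference = "ACDEFGHIKLMNPQRSTVWY"  # hypothetical reference sequence
    peptide = "XXFGHIKLMXX"             # hypothetical peptide
    print(longest_shared_run(peptide, reference))
    print(has_contiguous_fragment(peptide, reference, 8))
```

For an "at most" condition, the same run length would be compared with the inequality reversed.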

[0119] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-68; and a second peptide comprising at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of the first peptide, wherein the first peptide and the second peptide do not have identical sequences.

[0120] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-53; and a second peptide comprising at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of the first peptide, wherein the first peptide and the second peptide do not have identical sequences.

[0121] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-3, 5, 7, 9, 10, 12, 14, 16, 18, 19, 21, 23, 26, 28, 30, 32, 33, 38-40, 42, 45, 47, and 51; and a second peptide comprising at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of the first peptide, wherein the first peptide and the second peptide are not identical.

[0122] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-68; and a second peptide comprising at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of the first peptide, wherein the first peptide and the second peptide do not have identical sequences.

[0123] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-53; and a second peptide comprising at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of the first peptide, wherein the first peptide and the second peptide do not have identical sequences.

[0124] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-3, 5, 7, 9, 10, 12, 14, 16, 18, 19, 21, 23, 26, 28, 30, 32, 33, 38-40, 42, 45, 47, and 51; and a second peptide comprising at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of the first peptide, wherein the first peptide and the second peptide do not have identical sequences.
[0125] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-68; and at least an additional peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-68, wherein the first peptide and the additional peptide do not have identical sequences.

[0126] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-53; and at least an additional peptide comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-53, wherein the first peptide and the additional peptide do not have identical sequences.

[0127] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-68; and a second peptide comprising, consisting essentially of, or consisting of a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-68, wherein the first peptide and second peptide do not have identical sequences.

[0128] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53; and a second peptide comprising, consisting essentially of, or consisting of a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53, wherein the first peptide and second peptide do not have identical sequences.

[0129] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-68; and a second peptide comprising, consisting essentially of, or consisting of a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-68, wherein the first peptide and second peptide do not have identical sequences.

[0130] In some embodiments, a composition, e.g., a vaccine, comprises a first peptide comprising, consisting essentially of, or consisting of a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53; and a second peptide comprising, consisting essentially of, or consisting of a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53, wherein the first peptide and second peptide do not have identical sequences.

[0131] In some embodiments, a composition, e.g., a vaccine, described herein comprises a second peptide comprising, consisting essentially of, or consisting of a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-53. In some embodiments, the vaccine comprises a second peptide comprising, consisting essentially of, or consisting of a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-53. Sometimes, the second peptide consists of a sequence selected from SEQ ID NOs: 1-53.

[0132] In some embodiments, a composition, e.g., a vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides comprising a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-68.

[0133] In some embodiments, a composition, e.g., a vaccine, further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides comprising a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53. Each of the peptides can be no more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, or 100 amino acids in length.

[0134] In some embodiments, a composition, e.g., vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides comprising a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-68.

[0135] In some embodiments, a composition, e.g., vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 51 additional peptides comprising a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53. Each of the additional peptides can be no more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, or 100 amino acids in length.

[0136] In some embodiments, a composition, e.g., vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides comprising a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-68.

[0137] In some embodiments, a composition, e.g., vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 51 additional peptides comprising a sequence with at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-53.
Each of the additional peptides can be no more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, or 100 amino acids in length.

[0138] In some embodiments, a composition, e.g., a vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides comprising a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-68.

[0139] In some embodiments, a composition, e.g., vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 51 additional peptides comprising a sequence with at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to a sequence selected from SEQ ID NOs: 1-53.
Each of the additional peptides can be no more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, or 100 amino acids in length.

[0140] In some embodiments, a composition, e.g., a vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides consisting of a sequence selected from SEQ ID NOs: 1-68.

[0141] In some embodiments, a composition, e.g., vaccine, described herein further comprises from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 additional peptides consisting of a sequence selected from SEQ ID NOs: 1-53.

[0142] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-68, wherein the peptide is at most 50 amino acids in length. In some embodiments, the vaccine comprises a second peptide comprising at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-68, wherein the peptide is at most 50 amino acids in length.

[0143] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-53, wherein the peptide is at most 50 amino acids in length.

[0144] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-27, 32, and 40-46, wherein the peptide is at most 50 amino acids in length.

[0145] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 15 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 47-53, wherein the peptide is at most 50 amino acids in length.

[0146] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at most 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-68, wherein the peptide is at most 50 amino acids in length.

[0147] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at most 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-53, wherein the peptide is at most 50 amino acids in length.
[0148] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at most 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity to at least 8 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 1-27, 32, and 40-46, wherein the peptide is at most 50 amino acids in length. [0149] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at most 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity

to at least 15 contiguous amino acids of an amino acid sequence from the group consisting of SEQ ID NOs: 47-53, wherein the peptide is at most 50 amino acids in length. [0150] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising 100% sequence identity to the full length of an amino acid sequence from the group consisting of SEQ ID NOs: 1-27, 29-38, and 40-53, wherein the peptide is at most 50 amino acids in length. [0151] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide consisting 100% sequence identity to the full length of an amino acid sequence from the group consisting of SEQ ID NOs: 1-68. In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide consisting 100% sequence identity to the full length of an amino acid sequence from the group consisting of SEQ ID NOs: 1-53. In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising a sequence with at

least 70% sequence identity to the amino acid sequence selected from the group consisting of

SEQ ID NOs: 1-2, 5-15, 18-22, 26-32, or 39-46 wherein the peptide is less than 15 amino acids in length. In some embodiments, a composition, e.g., vaccine, described herein comprises a

peptide comprising a sequence with at least 90% sequence identity to SEQ ID NO: 29 or 31, wherein the peptide is at most 50 amino acids in length. In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide comprising at least 70% sequence identity to the amino acid sequence of SEQ ID NO: 28, wherein the peptide is 8 amino acids in length.

[0152] In some embodiments, a composition, e.g., vaccine, comprises a first peptide described herein and a second peptide described herein, and the first peptide and second peptide are attached to a CD4+ (helper) T cell epitope. In some cases, the second peptide comprises at

least 70% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some cases, the second peptide

comprises 100% sequence identity to at least 8 contiguous amino acids of a sequence selected from SEQ ID NOs: 1-53. In some cases, the second peptide comprises a sequence comprising

70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some cases, the second peptide comprises a sequence comprising 100% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53. In some cases, the second peptide consists of a sequence selected from SEQ ID NOs: 1-53.

[0153] In some embodiments, a composition, e.g., vaccine, comprises from 1 to 53 peptides comprising a sequence with at least 70% sequence identity to at least 8 contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the peptides is not identical to any other peptide in the composition, and wherein each of the peptides is not more than 50 amino acids in length.

[0154] In some embodiments, a composition, e.g., vaccine, comprises from 1 to 53 peptides comprising a sequence with 100% sequence identity to at least 8 contiguous amino acids of the sequences selected from SEQ ID NOs: 1-53, wherein each of the peptides is not identical to any other peptide in the composition, and wherein each of the peptides is not more than 50 amino acids in length.

[0155] In some embodiments, a composition, e.g., vaccine, comprises from 1 to 53 peptides comprising a sequence with at least 70% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the peptides is not identical to any other peptide in the composition, and wherein each of the peptides is not more than 50 amino acids in length.

[0156] In some embodiments, a composition, e.g., vaccine, comprises from 1 to 53 peptides comprising a sequence with 100% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53, wherein each of the peptides is not identical to any other peptide in the composition, and wherein each of the peptides is not more than 50 amino acids in length.
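For illustration, the structural limits recited in these embodiments (from 1 to 53 peptides, no two peptides identical, each peptide not more than 50 amino acids in length) could be verified for a candidate peptide set as sketched below. The function name and the example sequences are hypothetical; the sketch checks only these three limits and does not assess sequence identity to any SEQ ID NO.

```python
from typing import List


def check_composition(peptides: List[str], max_peptides: int = 53,
                      max_len: int = 50) -> List[str]:
    """Return a list of human-readable problems; an empty list means the
    peptide set satisfies the count, uniqueness, and length limits."""
    problems = []
    if not 1 <= len(peptides) <= max_peptides:
        problems.append(
            f"expected 1 to {max_peptides} peptides, got {len(peptides)}")
    if len(set(peptides)) != len(peptides):
        problems.append("composition contains identical peptides")
    for p in peptides:
        if len(p) > max_len:
            problems.append(
                f"peptide of length {len(p)} exceeds {max_len} residues")
    return problems


# Hypothetical placeholder peptides, not sequences from the specification.
print(check_composition(["ACDEFGHIK", "LMNPQRSTV"]))   # []
print(check_composition(["ACDEFGHIK", "ACDEFGHIK"]))   # one problem reported
```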
[0157] In some embodiments, a composition, e.g., vaccine, comprises from 1 to 53 peptides consisting of a sequence selected from SEQ ID NOs: 1-53, wherein each of the peptides is not identical to any other peptide in the composition.

[0158] In some cases, a composition comprises a peptide comprising about 2 to about 10, about 2 to about 20, about 2 to about 30, about 2 to about 40, or about 2 to about 53 of the sequences of SEQ ID NOs: 1-53 fused into one linked sequence. In some embodiments, a nucleic acid is provided that encodes the linked sequence.

[0159] In any of the compositions described herein, one or more peptides can be no more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, or 100 amino acids in length. One or more peptides can be 8-50, 4-40, 8-30, 8-20, 8-16, 8-15, 8-14, 8-13, 8-12, 8-11, 8-9, or 8-10 amino acids in length.

[0160] In some embodiments, a composition, e.g., vaccine, comprises two or more peptides with sequences based on sequences from the same protein. For example, a composition, e.g., vaccine, can comprise two or more peptides with sequences based on sequences from PB2, e.g., 2, 3, or 4 peptides comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1, 2, 3, and 4. In some embodiments, a composition, e.g., a vaccine, comprises two or more peptides with sequences based on sequences from PB1, e.g., 2, 3, 4, 5, 6,

7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 peptides comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 5-20. In some embodiments, a composition, e.g., vaccine, comprises two or more peptides with sequences based on sequences from PA, e.g., 2 peptides comprising, consisting essentially of, or consisting of sequences selected from SEQ ID NOs: 21 and 22. In some embodiments, a composition, e.g., vaccine, comprises two or more peptides with sequences based on sequences from NP, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 peptides with sequences selected from SEQ ID NOs: 26-38. In some embodiments, a composition, e.g., vaccine, comprises two or more peptides with sequences based on sequences from NA, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 peptides with sequences selected from SEQ ID NOs: 39-50. In some embodiments, a composition, e.g., vaccine, comprises two or more peptides with sequences based on the sequence of a protein from Segment 7, e.g., 2 or 3 peptides comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 51-53.

[0161] In some embodiments, a composition, e.g., vaccine, comprises one or more peptides comprising, consisting essentially of, or consisting of a sequence selected from SEQ ID NOs: 1-53 and one or more peptides with sequence from one or more strains of influenza A virus, one or more strains of influenza B virus, one or more strains of influenza C virus, or combinations thereof. Any of the compositions described herein can comprise one or more peptides with sequence from an influenza B virus, influenza C virus, or combinations thereof.

[0162] In some embodiments, a peptide provided herein is an HA protein, fragment or epitope thereof having an amino acid sequence as set forth in accession number: ACF54598, influenza A virus A/WSN/1933(H1N1) with one or more amino acid substitutions or deletions as set forth in Table S1 of Wu et al., "High-throughput profiling of influenza A virus hemagglutinin gene at single-nucleotide resolution" Scientific Reports 4:4942, DOI: 10.1038/srep04942, which is herein incorporated by reference in its entirety. In some embodiments the HA protein, fragment or epitope thereof has an amino acid sequence that is about 75% to about 99% identical to ACF54598, about 80% to about 99% identical to ACF54598, about 85% to about 99% identical to ACF54598, or about 90% to about 99% identical to ACF54598. In some embodiments, the one or more amino acid substitutions or deletions are selected from those having an RF index of 0.1 or more. In some embodiments, the one or more amino acid substitutions or deletions are selected from those having an RF index of 0.2 or more. 
In some embodiments, the one or more amino acid substitutions or deletions are selected from those having an RF index of 0.3 or more. [0163] In some embodiments, a composition, e.g., vaccine, described herein comprises a peptide described in the NIH Short Read Archive under the accession number: SRP033450 and the peptide Formulas disclosed herein (see Example 1), in which one or more wild type (WT) residues at one or more positions of the peptide can be replaced by a mutant amino acid residue.

Sometimes, the sequence of the peptide can have about 50%, 60%, 70%, 75%, or more sequence identity to its respective WT sequence.

[0164] Sometimes, a composition, e.g., vaccine, can comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, or more peptides, in which each of the peptides is described in the NIH Short Read Archive under the accession number: SRP033450 and by the peptide Formulas disclosed herein, and each of the peptides can have one or more WT residues at one or more positions replaced by a residue illustrated in the peptide Formulas disclosed herein.

[0165] Sometimes, a mutation at each amino acid residue can be evaluated based on the relative fitness (RF) index as described herein (see, e.g., Example VII and Example VIII). Sometimes, an RF index of above 0.1, 0.2, 0.3, 0.4, 0.5, or higher can indicate that the mutation can be tolerated. Sometimes, a mutant residue that has an RF index of above 0.1, 0.2, 0.3, 0.4, 0.5, or higher can be chosen as a substitution for its respective WT residue in a peptide sequence. Sometimes, a mutant peptide can be generated based on an RF index at each amino acid residue position.

[0166] Sometimes, a DNA count can be used in conjunction with data on RF indices in order to select a sequence for a peptide for use in a composition described herein. The DNA count can indicate the number of times a nucleic acid sequence encoding a particular residue has been observed in the experiment. Sometimes, a DNA count of about 20, 50, or 100 for a nucleic acid sequence encoding a particular residue can be used as a criterion for selecting a sequence for use in a composition, e.g., vaccine. Sometimes, a DNA count of about 100 or higher for a nucleic acid sequence encoding a particular residue can be used in conjunction with the RF index of the residue to determine a sequence for use in a composition, e.g., vaccine. 
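By way of a non-limiting sketch, selecting mutant residues by combining an RF index threshold with a DNA count threshold, and then generating a mutant peptide from a WT peptide, could be expressed as follows. The `Mutation` record, the function names, and every numeric value in the example are hypothetical illustrations; they are not data from the Examples or from accession SRP033450. The sketch assumes at most one selected substitution per residue position.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Mutation:
    position: int    # 1-based residue position in the WT peptide
    wt: str          # wild type residue at that position
    mutant: str      # candidate replacement residue
    rf_index: float  # relative fitness (RF) index of the mutation
    dna_count: int   # times the encoding nucleic acid sequence was observed


def select_substitutions(mutations: Iterable[Mutation], min_rf: float = 0.1,
                         min_dna_count: int = 100) -> List[Mutation]:
    """Keep mutations whose RF index is above min_rf and whose DNA count
    is min_dna_count or higher."""
    return [m for m in mutations
            if m.rf_index > min_rf and m.dna_count >= min_dna_count]


def apply_substitutions(wt_peptide: str, mutations: Iterable[Mutation]) -> str:
    """Replace WT residues with the selected mutant residues."""
    residues = list(wt_peptide)
    for m in mutations:
        if residues[m.position - 1] != m.wt:
            raise ValueError(f"position {m.position} is not {m.wt}")
        residues[m.position - 1] = m.mutant
    return "".join(residues)


# Hypothetical example values for illustration only.
wt = "ACDEFGHIK"
candidates = [
    Mutation(2, "C", "S", rf_index=0.45, dna_count=250),  # passes both
    Mutation(5, "F", "Y", rf_index=0.05, dna_count=900),  # RF index too low
    Mutation(7, "H", "N", rf_index=0.30, dna_count=40),   # DNA count too low
]
chosen = select_substitutions(candidates, min_rf=0.1, min_dna_count=100)
print(apply_substitutions(wt, chosen))  # ASDEFGHIK
```

Raising `min_rf` to 0.2, 0.3, 0.4, or 0.5 applies the stricter thresholds recited above without other changes.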
Sometimes, a DNA count of about 100 or higher and an RF index of above 0.1, 0.2, 0.3, 0.4, or 0.5 can be used as criteria to evaluate and select a mutant residue as a substitution for its respective WT residue in a peptide sequence. [0167] Sometimes, the sequences described herein can be incorporated into an engineered virus and the engineered virus can be used to screen against potential drug candidates. In some embodiments, in silico analysis can be utilized to screen for potential drug candidates for one or more of the peptide sequences illustrated in Tables 1-2 and peptide Formulas disclosed herein (see Example 1). [0168] In some embodiments, a composition, e.g., vaccine, described herein is used to treat a subject who has an influenza infection, such as an influenza A virus infection, an influenza B virus infection, or an influenza C virus infection. In some embodiments, the composition, e.g., vaccine, is used in a vaccination method against the infection of influenza A virus, influenza B virus, or influenza C virus. In some embodiments, the composition, e.g., vaccine, offers cross- protection against the different strains associated with the influenza A virus, the influenza B virus, and/or the influenza C virus. [0169] In some embodiments, a composition, e.g., vaccine, provided herein comprises, consists essentially of, or consists of one or more purified compounds (e.g., peptides or nucleic acid molecules) according to the present invention. As used herein, a "purified" peptide can mean that an amount of the macromolecular components that are naturally associated with the peptide have been removed from the peptide. As used herein, a composition comprising, consisting essentially of, or consisting of one or more purified peptides of the present invention can mean that the composition does not contain an amount of the macromolecular components that are naturally associated with the one or more peptides and/or the reagents used to synthesize the peptides. 
In some embodiments, the amount removed from the one or more peptides (or not present in the composition) is at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100% of the macromolecular components and/or reagents. In some embodiments, the composition is free of at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100% of the macromolecular components naturally associated with the one or more peptides and/or the reagents used to synthesize the one or more peptides. In some embodiments, the compositions of the present invention consist solely of one or more peptides according to the present invention, e.g., the one or more peptides in a solid or crystallized form.

[0170] In some embodiments, the peptides and/or nucleic acid molecules of the present invention can be isolated. As used herein, an "isolated" compound (e.g., peptide, nucleic acid molecule) refers to a compound which is isolated from its native environment. For example, an isolated peptide can be one which does not have its native amino acids which correspond to the full length polypeptide, flanking the N-terminus, C-terminus, or both. As another example, an isolated peptide can be one which is immobilized to a substrate with which the peptide is not naturally associated. As a further example, an isolated peptide can be one which is linked to another molecule, e.g., a PEG compound, with which the peptide is not naturally associated. 
Similarly, an "isolated" nucleic acid molecule can be one which does not have its native nucleic acid bases which correspond to the full length nucleic acid molecule, flanking its 5' end, 3' end, or both. As another example, an isolated nucleic acid molecule can be one which is bound to a substrate or a compound, e.g., a label such as a fluorescent tag, with which the nucleic acid molecule is not naturally associated. As a further example, with respect to nucleic acid molecules, the term isolated can mean that it is separated from the nucleic acid and cell in which it naturally occurs.

[0171] 2b. Hepatitis

[0172] In some embodiments, a composition, e.g., vaccine, developed with the methods described herein is used to target a hepatitis virus. Hepatitis can refer to an inflammation of the liver. Hepatitis can be acute with inflammation lasting less than six months, or chronic with inflammation lasting more than six months. Five hepatotropic viruses can cause hepatitis, termed hepatitis A, hepatitis B, hepatitis C, hepatitis D, or hepatitis E. Sometimes, hepatitis can be caused by mononucleosis (e.g., Epstein-Barr virus) or chickenpox (e.g., varicella virus). In some embodiments, the hepatitis A virus (HAV) is a picornavirus that can be transmitted through a fecal-oral route, sometimes through ingestion of contaminated food. In some embodiments, the hepatitis D virus is a deltavirus similar to a and can be propagated in the presence of hepatitis B virus. In some embodiments, the hepatitis E virus is a hepevirus and in immune-compromised patients can induce chronic infection.

[0173] In some embodiments, the hepatitis B virus (HBV) is an orthohepadnavirus. It can be a DNA virus whose genome can encode four proteins: a core protein termed HBcAg, a DNA polymerase, a surface antigen termed HBsAg, and a protein whose function has yet to be elucidated. It is estimated that more than one-third of the world's population is infected with HBV, about 15%-25% of whom are at risk of developing HBV-associated liver diseases, including cirrhosis and hepatocellular carcinoma. HBV can be divided into four major serotypes (adr, adw, ayr, and ayw) which can be based on the antigenic epitopes present on its envelope proteins. HBV can be further classified into 8 genotypes, A, B, C, D, E, F, G, and H, based on the overall nucleotide sequence variation of the genome. The genotypes can differ from about 8% to about 14% and can have distinct geographical distributions. 
[0174] Type A can further be divided into two subtypes: Aa (A1) in Africa/Asia and the Philippines, and Ae (A2) in Europe/United States. Type B contains Bj/B1 and Ba/B2, wherein j stands for Japan and a stands for Asia. Type Ba can further be divided into clades, including B2, B3, and B4. Type C can be categorized as Cs (C1) in South-east Asia and Ce (C2) in East Asia. In addition, six clades, C1, C2, C3, C4, C5, and C6, can also be present within the C subtype. Type D can be divided into 7 subtypes, D1, D2, D3, D4, D5, D6, and D7. Type F can be subdivided into 4 subtypes, F1, F2, F3, and F4. F1 can further be divided into 1a and 1b.

[0175] In some embodiments, vaccines described herein target HBVs of type A, B, C, D, E, F, G, and/or H. In some embodiments, vaccines described herein target HBVs of subtypes of type A, B, C, D, E, F, G, and/or H. In some embodiments, vaccines described herein target HBVs of type Aa (A1), Ae (A2), Bj/B1, Ba/B2, Ba/B3, Ba/B4, Cs (C1), Ce (C2), C3, C4, C5, C6, D1, D2, D3, D4, D5, D6, D7, F1, F2, F3, and/or F4.

[0176] In some embodiments, the hepatitis C virus (HCV) is a hepacivirus. It can be an RNA virus whose genome can encode nine proteins: Core protein, E1, E2, NS2, NS3, NS4A, NS4B, NS5A, and NS5B. HCV can be transmitted through blood or via the placenta, and can lead to chronic hepatitis, culminating in cirrhosis. Patients with hepatitis C can be susceptible to hepatitis A and/or hepatitis B.

[0177] Hepatitis C virus can be classified into seven genotypes: 1, 2, 3, 4, 5, 6, and 7. In general, the genotypes differ from one another by about 30%-35% with respect to the complete genome. The differences between the subtypes within a genotype can be from about 20% to about 25%. Subtypes 1a and 1b can cause up to 60% of all hepatitis C virus infections.
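As a rough illustration of how the divergence figures above might be applied, the following sketch computes the percentage of differing positions between two pre-aligned genome sequences and labels the pair accordingly. The cutoff values (30% and 20%) are assumptions taken from the lower ends of the approximate ranges stated above, the function names are hypothetical, and the short sequences are toy placeholders rather than HCV genomes; actual genotyping relies on full-length alignment and phylogenetic analysis.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing positions between two aligned sequences of
    equal length; columns with an alignment gap ('-') are skipped."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("aligned sequences must be non-empty and equal length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * differing / compared


def classify_pair(seq_a: str, seq_b: str, genotype_cutoff: float = 30.0,
                  subtype_cutoff: float = 20.0) -> str:
    """Label a pair of aligned genomes using assumed divergence cutoffs."""
    diff = percent_difference(seq_a, seq_b)
    if diff >= genotype_cutoff:
        return "different genotypes"
    if diff >= subtype_cutoff:
        return "different subtypes of one genotype"
    return "same subtype"


# Toy 10-nucleotide placeholders, for illustration only.
print(classify_pair("ACGTACGTAC", "ACGTACGTAC"))  # same subtype
print(classify_pair("ACGTACGTAC", "TCGTACGTAA"))  # different subtypes of one genotype
print(classify_pair("ACGTACGTAC", "TCGAACGTAA"))  # different genotypes
```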

[0178] In some embodiments, vaccines described herein target HCVs of the genotypes 1, 2, 3, 4, 5, 6, and/or 7. Vaccines described herein can target HCVs of subtypes of the genotypes 1, 2, 3, 4, 5, 6, and/or 7. In some embodiments, vaccines described herein target HCVs of subtypes 1a and/or 1b.

[0179] In some embodiments, a vaccine for hepatitis is developed using one or more of the methods described herein. In some embodiments, a vaccine is developed using peptides identified from one or more of the hepatitis viral proteins. Sometimes, the peptides are at most 4,

5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or fewer amino acid

residues in length. Sometimes, the peptides are at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or more amino acid residues in length. In some cases, a vaccine can also be developed using an antibody or fragment thereof that recognize a peptide, antigen presenting cells (APC) that are loaded with the peptide, a nucleic acid that encode the peptide, or a composition that comprises one or more of the peptide, APC, antibody or fragment thereof, or nucleic acid. [0180] In some embodiments, a vaccine for hepatitis A virus is developed using one or more of the methods described herein. In some embodiments, a vaccine is developed using peptides identified from one or more of the hepatitis A viral proteins. Sometimes, the peptides are at most 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or less amino acid residues in length. Sometimes, the peptides are at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or more amino acid residues in length. In some cases, a vaccine can also be developed using an antibody or fragments thereof that recognizes a peptide, antigen presenting cells (APC) that are loaded with the peptide, a nucleic acid that encode the peptide, or a composition that comprises one or more of the peptide, APC, antibody or fragment thereof, or nucleic acid. [0181] In some embodiments, a vaccine for hepatitis B virus is developed using one or more of the methods described herein. In some embodiments, a vaccine is developed using peptides identified from one or more of the hepatitis B viral proteins. 
Sometimes, the peptides are at most 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or less amino acid residues in length. Sometimes, the peptides are at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or more amino acid residues in length. In some cases, a vaccine can also be developed using an antibody or fragment thereof that recognize a peptide, antigen presenting cells (APC) that are loaded with the peptide, a nucleic acid that encode the peptide, or a composition that comprises one or more of the peptide, APC, antibody or fragment thereof, or nucleic acid. [0182] In some embodiments, a vaccine for hepatitis C virus is developed using one or more of the methods described herein. In some embodiments, a vaccine is developed using peptides identified from one or more of the hepatitis C viral proteins. Sometimes, the peptides are at most 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or less amino acid residues in length. Sometimes, the peptides are at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or more amino acid residues in length. In some cases, a vaccine can also be developed using an antibody or fragment thereof that recognize a peptide, antigen presenting cells (APC) that are loaded with the peptide, a nucleic acid that encode the peptide, or a composition that comprises one or more of the peptide, APC, antibody or fragment thereof, or nucleic acid. 
[0183] In some embodiments, a vaccine described herein is used to treat a patient who has a hepatitis infection, such as an HAV infection, an HBV infection, or an HCV infection. Sometimes, a vaccine described herein can be used in a vaccination method against the infection of HAV, HBV, or HCV. Sometimes, a vaccine described herein offers cross-protection against the different strains associated with HAV, HBV, and/or HCV.

[0184] Sometimes, a patient having HBV infection also has an HBV-associated disorder. HBV-associated disorders may include cirrhosis, fulminant hepatic failure, hepatocellular carcinoma, spider angiomata, splenomegaly, gynecomastia, renal diseases (e.g., proteinuria, hematuria), heart diseases (e.g., pericarditis, congestive heart failure), gastrointestinal diseases (e.g., acute abdominal pain and bleeding), skin lesions, and neurological disorders (e.g., mononeuritis multiplex, central nervous system abnormalities). In some embodiments, a vaccine described above is used to treat a patient having an HBV infection and one or more HBV-associated disorders.

[0185] In some cases, a patient having HCV infection also has an HCV-associated disorder. HCV-associated disorders may include cirrhosis, cryoglobulinemia, skin disorders (e.g., lichen planus, porphyria cutanea tarda), non-Hodgkin's lymphoma, and type 2 diabetes. In some embodiments, a vaccine described above can be used to treat a patient having an HCV infection and one or more HCV-associated disorders.

[0186] 2c. Bacteria and Fungi

[0187] In some embodiments, a vaccine prepared using one or more of the methods described herein is developed to treat a patient who has a bacterial infection or a fungal infection. Examples of bacteria include: Helicobacter pylori, Borrelia burgdorferi, Legionella pneumophila, Mycobacteria spp. (e.g., M. tuberculosis, M. avium, M. intracellulare, M. kansasii, M. gordonae), Staphylococcus aureus, Neisseria gonorrhoeae, Neisseria meningitidis, Listeria monocytogenes, Streptococcus pyogenes (Group A Streptococcus), Streptococcus agalactiae (Group B Streptococcus), Streptococcus (viridans group), Streptococcus faecalis, Streptococcus bovis, Streptococcus (anaerobic spp.), Streptococcus pneumoniae, pathogenic Campylobacter sp., Enterococcus sp., Haemophilus influenzae, Bacillus anthracis, Corynebacterium diphtheriae, Corynebacterium sp., Erysipelothrix rhusiopathiae, Clostridium perfringens, Clostridium tetani, Enterobacter aerogenes, Klebsiella pneumoniae, Pasteurella multocida, Bacteroides sp., Fusobacterium nucleatum, Streptobacillus moniliformis, Treponema pallidum, Treponema pertenue, Leptospira, and Actinomyces israelii.

[0188] Examples of fungi include: Cryptococcus neoformans, Histoplasma capsulatum, Coccidioides immitis, Blastomyces dermatitidis, Chlamydia trachomatis, and Candida albicans. Other infectious organisms (i.e., ) include: Plasmodium falciparum and Toxoplasma gondii.

[0189] Sometimes, a vaccine using one or more of the methods described herein can be developed for use in a vaccination method against a bacterial infection or a fungal infection. Sometimes, a vaccine using one or more of the methods described herein can offer cross-protection against multiple species of bacteria or fungi.

[0190] 3. METHODS OF DETERMINING PEPTIDES/EPITOPES FOR USE IN VACCINE DEVELOPMENT

[0191] As illustrated in Fig. 1, the methods described herein for generation of the nucleic acid library (101) can allow simulation of all possible mutations that can occur in a particular pathogen strain. The library (101) can be generated in any cell (e.g., host cell) system or in a cell-free system utilizing any cells (e.g., host cells) or cellular components, vectors, reagents, and mutagenesis methodologies that are well known in the art, such as a stochastic method (e.g., error-prone methodology) or a non-stochastic method (e.g., site-directed or saturated site-directed mutagenesis).

[0192] The cell (e.g., host cell) can include any suitable cell such as a naturally derived cell or a genetically modified cell. The host cell can be a eukaryotic cell or a prokaryotic cell. A eukaryotic cell can include a fungal cell (e.g., a yeast cell), an animal cell, or a plant cell. The prokaryotic cell can be a bacterial cell. A bacterial cell can be a gram-positive bacterium or a gram-negative bacterium. Sometimes the gram-negative bacterium is anaerobic, rod-shaped, or both.

[0193] The gram-positive bacteria can be Actinobacteria, Firmicutes, or Tenericutes. The gram-negative bacteria can be Aquificae, Deinococcus-Thermus, Fibrobacteres-Chlorobi/Bacteroidetes (FCB group), Fusobacteria, Gemmatimonadetes, Nitrospirae, Planctomycetes-Verrucomicrobia/Chlamydiae (PVC group), Proteobacteria, Spirochaetes or Synergistetes. Other bacteria can be Acidobacteria, Chloroflexi, Chrysiogenetes, Cyanobacteria, Deferribacteres, Dictyoglomi, Thermodesulfobacteria, or Thermotogae. A bacterial cell can be Escherichia coli, Clostridium botulinum, or Coli bacilli.

[0194] Exemplary prokaryotic host cells include BL21, Mach1™, DH10B™, TOP10, DH5α, DH10Bac™, OmniMax™, MegaX™, DH12S™, INV110, TOP10F', INVαF', TOP10/P3, ccdB Survival, PIR1, PIR2, Stbl2™, Stbl3™, or Stbl4™.

[0195] Animal cells can include a cell from a vertebrate or from an invertebrate. 
An animal cell can include a cell from a marine invertebrate, fish, insects, amphibian, reptile, or mammal. A fungus cell can include a yeast cell, such as brewer's yeast, baker's yeast, or wine yeast. [0196] Fungi include ascomycetes such as yeast, mold, filamentous fungi, basidiomycetes, or zygomycetes. Yeast can include Ascomycota or Basidiomycota. Ascomycota can include Saccharomycotina (true yeasts, e.g., Saccharomyces cerevisiae (baker's yeast)) or Taphrinomycotina (e.g., Schizosaccharomycetes (fission yeasts)). Basidiomycota can include Agaricomycotina (e.g., Tremellomycetes) or Pucciniomycotina (e.g., Microbotryomycetes). [0197] Yeast or filamentous fungi can include the genus: Saccharomyces, Schizosaccharomyces, Candida, Pichia, Hansenula, Kluyveromyces, Zygosaccharomyces, Yarrowia, Trichosporon, Rhodosporidi, Aspergillus, Fusarium, or Trichoderma. Yeast or filamentous fungi can include the species: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida utilis, Candida boidini, Candida albicans, Candida tropicalis, Candida stellatoidea, Candida glabrata, Candida krusei, Candida parapsilosis, Candida guilliermondii, Candida viswanathii, Candida lusitaniae, Rhodotorula mucilaginosa, Pichia metanolica, Pichia angusta, Pichia pastoris, Pichia anomala, Hansenula polymorpha, Kluyveromyces lactis, Zygosaccharomyces rouxii, Yarrowia lipolytica, Trichosporon pullulans, Rhodosporidium toru- Aspergillus niger, Aspergillus nidulans, Aspergillus awamori, Aspergillus oryzae, Trichoderma reesei, Yarrowia lipolytica, Brettanomyces bruxellensis, Candida stellata, Schizosaccharomyces pombe, Torulaspora delbrueckii, Zygosaccharomyces bailii, Cryptococcus neoformans, Cryptococcus gattii, or Saccharomyces boulardii. [0198] Exemplary yeast host cells include Pichia pastoris yeast strains such as GS1 15, KM71H, SMD1 168, SMD1 168H, and X-33; and Saccharomyces cerevisiae yeast strain such as INVScl. 
[0199] Additional animal cells can be of a mollusk, arthropod, annelid, or sponge. A mammalian cell can be of a primate, ape, equine, bovine, porcine, canine, feline, or rodent. The mammal can be a primate, ape, dog, cat, rabbit, ferret, or the like. The rodent can be a mouse, rat, hamster, gerbil, chinchilla, fancy rat, or guinea pig. The bird cell can be of a canary, parakeet, or parrot. The reptile cell can be of a turtle, lizard, or snake. The fish cell can be of a tropical fish. The fish cell can be of a zebrafish (e.g., Danio rerio). The worm cell can

be of a nematode (e.g., C. elegans). The amphibian cell may be of a frog. The arthropod cell may be of a tarantula or hermit crab.

[0200] Exemplary mammalian host cells include 293T cell line, 293A cell line, 293FT cell line, 293F cells, 293 H cells, A549 cells, MDCK cells, CHO DG44 cells, CHO-S cells, CHO-K1 cells, Expi293F™ cells, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cells, FreeStyle™ CHO-S cells, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cells, T-REx™ Jurkat cell line, Per.C6 cells, T-REx™-293 cell line, T-REx™-CHO cell line, and T-REx™-HeLa cell line.

[0201] The mammalian host cell can be a stable cell line, that is, a cell line that has incorporated genetic material of interest into its own genome and can express the product of the genetic material after many generations of cell division. The mammalian host cell can be a transient cell line, that is, a cell line that has not incorporated genetic material of interest into its own genome and cannot express the product of the genetic material after many generations of cell division.

[0202] Exemplary insect host cells include Drosophila S2 cells, Sf9 cells, Sf21 cells, High Five™ cells, and expresSF+® cells.

[0203] Plant cells can include a cell from algae. Exemplary algal cell lines include strains such as Chlamydomonas reinhardtii 137c or Synechococcus elongatus PCC 7942.

[0204] Vectors can include any suitable vectors derived from either eukaryotic or prokaryotic sources. Vectors can be from bacterial (e.g., E. coli), insect, yeast (e.g., Pichia pastoris), algal, or mammalian sources. Bacterial vectors can include pACYC177, pASK75, pBAD vector series, pBADM vector series, pET vector series, pETM vector series, pGEX vector series, pHAT, pHAT2, pMal-c2, pMal-p2, pQE vector series, pRSET A, pRSET B, pRSET C, pTrcHis2 series, pZA31-Luc, pZE21-MCS-1, pFLAG ATS, pFLAG CTS, pFLAG MAC, pFLAG Shift-12c, pTAC-MAT-1, pFLAG CTC, or pTAC-MAT-2.

[0205] Insect vectors can include pFastBac1, pFastBac DUAL, pFastBac ET, pFastBac HTa, pFastBac HTb, pFastBac HTc, pFastBac M30a, pFastBac M30b, pFastBac M30c, pVL1392, pVL1393, pVL1393 M10, pVL1393 M11, pVL1393 M12, FLAG vectors such as pPolh-FLAG1 or pPolh-MAT2, or MAT vectors such as pPolh-MAT1 or pPolh-MAT2.

[0206] Yeast vectors can include Gateway® pDEST™ 14 vector, Gateway® pDEST™ 15 vector, Gateway® pDEST™ 17 vector, Gateway® pDEST™ 24 vector, Gateway® pYES-DEST52 vector, pBAD-DEST49 Gateway® destination vector, pAO815 Pichia vector, pFLD1 Pichia pastoris vector, pGAPZ A, B, & C Pichia pastoris vector, pPIC3.5K Pichia vector, pPIC6 A, B, & C Pichia vector, pPIC9K Pichia vector, pTEF1/Zeo, pYES2 yeast vector, pYES2/CT yeast vector, pYES2/NT A, B, & C yeast vector, or pYES3/CT yeast vector.

[0207] Algae vectors can include pChlamy-4 vector or MCS vector.

[0208] Mammalian vectors can include transient expression vectors or stable expression vectors. Mammalian transient expression vectors may include p3xFLAG-CMV 8, pFLAG-Myc-CMV 19, pFLAG-Myc-CMV 23, pFLAG-CMV 2, pFLAG-CMV 6a,b,c, pFLAG-CMV 5.1, pFLAG-CMV 5a,b,c, p3xFLAG-CMV 7.1, pFLAG-CMV 20, p3xFLAG-Myc-CMV 24, pCMV-FLAG-MAT1, pCMV-FLAG-MAT2, pBICEP-CMV 3, or pBICEP-CMV 4. Mammalian stable expression vectors may include pFLAG-CMV 3, p3xFLAG-CMV 9, p3xFLAG-CMV 13, pFLAG-Myc-CMV 21, p3xFLAG-Myc-CMV 25, pFLAG-CMV 4, p3xFLAG-CMV 10, p3xFLAG-CMV 14, pFLAG-Myc-CMV 22, p3xFLAG-Myc-CMV 26, pBICEP-CMV 1, or pBICEP-CMV 2.

[0209] A cell-free system can be a mixture of cytoplasmic and/or nuclear components from a cell and can be used for in vitro nucleic acid synthesis. The cell-free system can utilize either prokaryotic cell components or eukaryotic cell components. Nucleic acid synthesis can be obtained in a cell-free system based on, for example, Drosophila cells, Xenopus eggs, or HeLa cells. Exemplary cell-free systems include E. coli S30 Extract system, E. coli T7 S30 system, or PURExpress®.

[0210] 3a. Mutagenesis methods for generating a nucleic acid library

[0211] 3a1. Random mutagenesis

[0212] A stochastic mutagenesis method can be a random mutagenesis method. Random mutagenesis can be a method of generating a library of protein mutants with different functional properties. For example, random mutations can be first introduced into a gene to generate a library containing thousands of different versions of this gene. Each version or variant of this gene can then be expressed, and the property of each expressed protein can then be evaluated for function. Random mutagenesis can be achieved using error-prone PCR, rolling circle error-prone PCR, mutator strains, temporary mutator strains, insertion mutagenesis, ethyl methanesulfonate, nitrous acid, or DNA shuffling. Random mutations can also be generated using physical agents, such as UV, ionizing radiation, X-rays, or gamma rays, or chemical agents, such as mustard gas, cyclophosphamide, or cisplatin.

[0213] Error-prone PCR can be a PCR method in which a polymerase has a high error rate, in some cases up to 2%, during amplification of a wild-type sequence. Point mutations or single nucleotide mutations can be a common type of mutation in error-prone PCR. In rolling circle error-prone PCR, a wild-type sequence can be first cloned into a plasmid, and then the whole plasmid can be amplified under error-prone PCR conditions. A mutator strain approach can utilize a mutator strain such as XL1-Red (Stratagene), an E. coli strain that can be deficient in three DNA repair pathways (mutS, mutD, and mutT) and can therefore make errors during replication. The temporary mutator strain method can utilize a strain that is deficient in one DNA repair pathway (mutD5), instead of three DNA repair pathways. An insertion mutagenesis can utilize a transposon-based system to randomly insert a 15-base sequence throughout a sequence of interest. An ethyl methanesulfonate (EMS) approach can utilize the chemical EMS to alkylate guanine residues, thereby causing them to be incorrectly copied during DNA replication. Nitrous acid is a second chemical mutagen that can introduce mutations by deaminating adenine and cytosine residues, thereby causing transition point mutations. A DNA shuffling approach can be achieved by randomly digesting the sequence of interest or a sequence library with DNase I and then randomly re-joining the fragments using self-priming PCR.

[0214] In some cases, an error-prone PCR method is used for generation of a nucleic acid library described herein. Sometimes, an error-prone PCR method is used to generate a library comprising nucleic acids derived from a virus. An error-prone PCR method can be used to generate a library comprising nucleic acids derived from an RNA virus, such as influenza A virus, influenza B virus, influenza C virus, hepatitis C virus, HIV-1, or HIV-2. An error-prone PCR method can be used to generate a library comprising nucleic acids derived from a DNA virus, such as hepatitis B virus.
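As a rough illustration of how error-prone copying can diversify a wild-type sequence into a library, the following Python sketch introduces point substitutions at a fixed per-base error rate. It is a simplified model under stated assumptions: the function names, the example sequence, and the library size are hypothetical, each variant is produced by a single copying pass, insertions, deletions, and polymerase bias are ignored, and the 2% rate is simply the upper error rate mentioned in paragraph [0213].

```python
import random

BASES = "ACGT"

def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Point substitution: pick one of the three other bases.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)

def make_library(template, error_rate, size, seed=0):
    """Generate `size` variants of the template (hypothetical helper)."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]

def count_substitutions(template, variant):
    """Number of positions at which the variant differs from the template."""
    return sum(1 for a, b in zip(template, variant) if a != b)

# Arbitrary example sequence, not taken from any viral genome.
wild_type = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCC"
library = make_library(wild_type, error_rate=0.02, size=1000)
```

Each library member has the same length as the template; `count_substitutions` can be used to inspect how many point mutations a given variant carries.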

[0215] 3a2. Non-random mutagenesis

[0216] A non-stochastic method can be a non-random mutagenesis method. One example is site-directed mutagenesis. Site-directed mutagenesis can be a method that allows specific alterations or modifications within the gene of interest. In some instances, a site-directed mutagenesis can utilize cassette mutagenesis, PCR site-directed mutagenesis, whole plasmid mutagenesis, Kunkel's method, or an in vivo site-directed mutagenesis method. A cassette mutagenesis can allow for synthesized fragments of DNA to be inserted into a plasmid using a restriction enzyme and ligation method. In some cases, it does not involve polymerization. A PCR site-directed mutagenesis can be similar to the cassette mutagenesis, but larger fragments can be obtained, separated by gel electrophoresis from the template fragments, and then ligated into the gene of interest. Whole plasmid mutagenesis, such as the QuikChange® method, can allow mutations to be introduced using one or more primers, followed by amplification of the entire plasmid. This method can differ from the PCR site-directed mutagenesis in that the plasmid is in a linear format and that it does not need to be exponentially amplified as in a PCR. Kunkel's method can be a primer-based site-directed method. It can differ from the previous methods in that it can utilize an E. coli strain that is deficient in dUTPase, an enzyme that can prevent the bacteria from incorporating uracil during DNA replication, to distinguish between product and template strands, thereby allowing for easier selection of plasmids containing the desired mutation. An in vivo site-directed mutagenesis method can include the Delitto perfetto method, transplacement "pop-in pop-out" method, direct gene deletion and site-specific mutagenesis with PCR and one recyclable marker, direct gene deletion and site-specific mutagenesis with PCR and one recyclable marker using long homologous regions, or in vivo site-directed mutagenesis with synthetic oligonucleotides. These methods can be carried out under in vivo conditions.

[0217] Sometimes, a site-specific PCR method is used to generate a library comprising nucleic acids derived from a virus. A site-specific PCR method can be used to generate a library comprising nucleic acids derived from an RNA virus, such as influenza A virus, influenza B virus, influenza C virus, hepatitis C virus, HIV-1, or HIV-2. A site-specific PCR method can be used to generate a library comprising nucleic acids derived from a DNA virus, such as hepatitis B virus.

[0218] Another example of a non-stochastic method is site-saturation mutagenesis, which can allow mutation samplings of up to 19 additional different amino acid variations at each amino acid position. This method can utilize degenerate codons for introduction of mutations. For example, a primer may contain a degenerate codon NNN, where N = A, C, G, or T, to generate a progeny library containing 64 different sequences (also known as 64-fold degeneracy) encoding all 20 amino acids. Additional degenerate codons can include NNK, where K is T or G, to generate a progeny library containing 32 different sequences (also known as 32-fold degeneracy), or NDT, where D is A, G, or T, to generate a progeny library containing 12 different sequences (also known as 12-fold degeneracy).

[0219] A site-saturation mutagenesis method can be used to generate a library comprising nucleic acids derived from a virus. A site-saturation mutagenesis method can be used to generate a library comprising nucleic acids derived from an RNA virus, such as influenza A virus, influenza B virus, influenza C virus, hepatitis C virus, HIV-1, or HIV-2. A site-saturation mutagenesis method can be used to generate a library comprising nucleic acids derived from a DNA virus, such as hepatitis B virus.
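The degeneracy figures given for the NNN, NNK, and NDT codons in paragraph [0218] can be checked by enumeration. The Python sketch below expands each degenerate codon into its concrete codons and translates them with the standard genetic code. The helper names `expand` and `summarize` are hypothetical, and the use of the standard code (rather than any organism-specific code) is an assumption.

```python
from itertools import product

# Nucleotide ambiguity codes used by the degenerate codons named in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "D": "AGT"}

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product("TCAG", repeat=3), AMINO)
}

def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon."""
    choices = (IUPAC[symbol] for symbol in degenerate_codon)
    return ["".join(bases) for bases in product(*choices)]

def summarize(degenerate_codon):
    """Return (codon count, distinct amino acids, stop codon count)."""
    codons = expand(degenerate_codon)
    translated = [CODON_TABLE[c] for c in codons]
    amino_acids = {aa for aa in translated if aa != "*"}
    return len(codons), len(amino_acids), translated.count("*")

for name in ("NNN", "NNK", "NDT"):
    n_codons, n_amino_acids, n_stops = summarize(name)
    print(name, n_codons, n_amino_acids, n_stops)
```

Under the standard genetic code, the enumeration also shows a design trade-off: NNK retains all 20 amino acids while keeping a single stop codon, whereas NDT encodes 12 amino acids with no stop codon.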

[0220] 3b. Viral production and sequencing methods for producing a second sequencing library

[0221] As shown in Fig. 1, the nucleic acid library can be introduced into any cells that support viral replication to generate a set of viruses (102). Recombinant virus generation methods are well known in the art.

[0222] 3b1. Cell transfection

[0223] In some cases, the methods can involve one or more transfection steps and one or more infection steps. Cells from cell lines such as 293T, Madin-Darby canine kidney (MDCK), MDBK, or C227 (293T cells stably expressing a dominant negative IRF3) may be used for transfection to produce one or more of the viruses described herein. Any cell line can be used for the infection step to produce one or more viruses described herein. Additional cell lines include mammalian cell lines such as 293A cell line, 293FT cell line, 293F cells, 293 H cells, CHO DG44 cells, CHO-S cells, CHO-K1 cells, Expi293F™ cells, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cells, FreeStyle™ CHO-S cells, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cells, T-REx™ Jurkat cell line, Per.C6 cells, T-REx™-293 cell line, T-REx™-CHO cell line, T-REx™-HeLa cell line, 3T6, A549, A9, AtT-20, BALB/3T3, BHK-21, BHL-100, BT, Caco-2, Chang, Clone 9, Clone M-3, COS-1, COS-3, COS-7, CRFK, CV-1, D-17, Daudi, GH1, GH3, H9, HaK, HCT-15, HEp-2, HL-60, HT-1080, HT-29, HUVEC, I-10, IM-9, JEG-2, Jensen, K-562, KB, KG-1, L2, LLC-WRC 256, McCoy, MCF7, VERO, WI-38, WISH, XC, or Y-1; and insect cell lines such as Drosophila S2 cells, Sf9 cells, Sf21 cells, High Five™ cells, or expresSF+® cells. In some instances, viruses can be produced in algae cells, such as Phaeocystis pouchetii.

[0224] In some cases, the methods described herein involve at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more transfection steps. Sometimes, the methods described herein involve at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, or fewer transfection steps. The methods described herein can involve at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more infection steps. The methods described herein can involve at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, or fewer infection steps.

[0225] 3b2. Determining sequence [0226] The sequences obtained from the viruses can be sequenced using a next-generation sequencing (NGS) method (103), and then compared with the sequences from the nucleic acid library (104) and/or with sequences from a prior passage. Next-generation sequencing techniques include, for example, Helicos True Single Molecule Sequencing (tSMS) (Harris T.D. et al. (2008) Science 320:106-109); 454 sequencing (Roche) (Margulies, M. et al. 2005, Nature, 437, 376-380); SOLiD technology (Applied Biosystems); SOLEXA sequencing (Illumina); single molecule, real-time (SMRT™) technology of Pacific Biosciences; nanopore sequencing (Soni GV and Meller A. (2007) Clin Chem 53: 1996-2001); semiconductor sequencing (Ion Torrent; Personal Genome Machine); DNA nanoball sequencing; sequencing using technology from Dover Systems (Polonator), and technologies that do not require amplification or otherwise transform native DNA prior to sequencing (e.g., Pacific Biosciences and Helicos), such as nanopore-based strategies (e.g., Oxford Nanopore, Genia Technologies, and Nabsys). [0227] In some instances, the next generation sequencing technique is 454 sequencing (Roche), which can involve amplification of nucleic acids on beads and pyrosequencing (see e.g., Margulies, M et al. (2005) Nature 437: 376-380). [0228] In some embodiments, the next generation sequencing technique is SOLiD technology (Applied Biosystems; Technologies). In SOLiD sequencing, genomic DNA can be sheared into fragments, and adaptors can be attached to the 5' and 3' ends of the fragments to generate a fragment library. Alternatively, internal adaptors can be introduced by ligating adaptors to the 5' and 3' ends of the fragments, circularizing the fragments, digesting the circularized fragment to generate an internal adaptor, and attaching adaptors to the 5' and 3' ends of the resulting fragments to generate a mate-paired library. 
Next, clonal bead populations can be prepared in microreactors containing beads, primers, template, and PCR components. Following PCR, the templates can be denatured and beads can be enriched to separate the beads with extended templates. Templates on the selected beads can be subjected to a 3' modification that permits bonding to a glass slide. A sequencing primer can bind to adaptor sequence. A set of four fluorescently labeled di-base probes can compete for ligation to the sequencing primer. Specificity of the di-base probe can be achieved by interrogating every first and second base in each ligation reaction. The sequence of a template can be determined by sequential hybridization and ligation of partially random oligonucleotides with a determined base (or pair of bases) that can be identified by a specific fluorophore. After a color is recorded, the ligated oligonucleotide can be cleaved and removed and the process can be then repeated. Following a series of ligation cycles, the extension product can be removed and the template can be reset with a primer complementary to the n-1 position for a second round of ligation cycles. Five rounds of primer reset can be completed for each sequence tag. Through the primer reset process, most of the bases can be interrogated in two independent ligation reactions by two different primers. Up to 99.99% accuracy can be achieved by sequencing with an additional primer using a multi-base encoding scheme.

[0229] In some embodiments, the next generation sequencing technique is SOLEXA sequencing (ILLUMINA sequencing). ILLUMINA sequencing can be based on the amplification of DNA on a solid surface using fold-back PCR and anchored primers. ILLUMINA sequencing can involve a library preparation step. Genomic DNA can be fragmented, and sheared ends can be repaired and adenylated. Adaptors can be added to the 5' and 3' ends of the fragments. The fragments can be size selected and purified. ILLUMINA sequencing can comprise a cluster generation step. DNA fragments can be attached to the surface of flow cell channels by hybridizing to a lawn of oligonucleotides attached to the surface of the flow cell channel. The fragments can be extended and clonally amplified through bridge amplification to generate unique clusters. The fragments become double stranded, and the double stranded molecules can be denatured. Multiple cycles of the solid-phase amplification followed by denaturation can create several million clusters of approximately 1,000 copies of single-stranded DNA molecules of the same template in each channel of the flow cell. Reverse strands can be cleaved and washed away. Ends can be blocked, and primers can be hybridized to DNA templates. ILLUMINA sequencing can comprise a sequencing step. Hundreds of millions of clusters can be sequenced simultaneously. Primers, DNA polymerase, and four fluorophore-labeled, reversibly terminating nucleotides can be used to perform sequential sequencing. All four bases can compete with each other for the template. After nucleotide incorporation, a laser can be used to excite the fluorophores, an image is captured, and the identity of the first base is recorded. The 3' terminators and fluorophores from each incorporated base are removed, and the incorporation, detection, and identification steps are repeated. A single base can be read each cycle. In some embodiments, a HiSeq system (e.g., HiSeq 2500, HiSeq 1500, HiSeq 2000, or HiSeq 1000) is used for sequencing. In some embodiments, a MiSeq personal sequencer is used. In some embodiments, a NextSeq system is used. In some embodiments, a Genome Analyzer IIx is used.

[0230] In some embodiments, the next generation sequencing technique comprises single molecule, real-time (SMRT™) technology by Pacific Biosciences, which can use DNA bases attached to one of four different phospholinked fluorescent dyes and a single DNA polymerase immobilized with a single molecule of template single stranded DNA at the bottom of a zero-mode waveguide (ZMW) (see e.g., US Patent Nos. 8906831, 8367159, 8583380; U.S. Patent Application No. 20130294972).

[0231] Sometimes, the next generation sequencing is nanopore sequencing; e.g., a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule can obstruct the nanopore to a different degree, and the change in the current passing through the nanopore as the DNA molecule passes through the nanopore can represent a reading of the DNA sequence (see e.g., Soni GV and Meller A. (2007) Clin Chem 53: 1996-2001; Garaj et al. (2010) Nature vol. 467, doi:10.1038/nature09379; US Patent Application Publication Nos. 20140174927, 20140266147, 20130244340, 20120052188, 20070190542, 20060063171). The nanopore sequencing system can be one developed or commercialized by, e.g., Oxford Nanopore Technologies (MinION™, GridION™, PromethION™); Genia Technologies (nanopore embedded in lipid bilayer membrane); Nabsys (sequencing-by-hybridization approach); or IBM/Roche (use of electron beam to make a nanopore sized opening in a microchip).

[0232] In some embodiments, the next generation sequencing comprises ion semiconductor sequencing (e.g., using technology from Life Technologies (Ion Torrent)). Ion semiconductor sequencing can take advantage of the fact that when a nucleotide is incorporated into a strand of DNA, an ion can be released, which can be measured as a change in pH (see e.g., U.S. Patent Application Publication Nos. 20100301398, 20100300895, 20100300559, 20100197507, 20100137143, 20090127589, or 20090026082; Anderson et al., Sensors and Actuators B Chem., 129: 79-86 (2008); and Pourmand et al., Proc. Natl. Acad. Sci., 103: 6466-6470 (2006)). In some embodiments, an ION PROTON™, ION PGM™, or IonChef™ Sequencer is used.

[0233] In some embodiments, the next generation sequencing is DNA nanoball sequencing (as performed, e.g., by Complete Genomics; see e.g., Drmanac et al. (2010) Science 327: 78-81, US Patent Application Publication No. 20130059740).

[0234] In some embodiments, the next generation sequencing technique is Helicos True Single Molecule Sequencing (tSMS) (see e.g., Harris T. D. et al. (2008) Science 320:106-109).

[0235] In some embodiments, the sequencing technique comprises paired-end sequencing in which both the forward and reverse template strands can be sequenced. In some embodiments, the sequencing technique can comprise mate pair library sequencing. In mate pair library sequencing, DNA can be fragmented, and 2-5 kb fragments can be end-repaired (e.g., with biotin labeled dNTPs). The DNA fragments can be circularized, and non-circularized DNA can be removed by digestion. Circular DNA can be fragmented and purified (e.g., using the biotin labels). Purified fragments can be end-repaired and ligated to sequencing adaptors.
[0236] The nucleic acids obtained from the viruses can be sequenced using sequencing methods such as Sanger sequencing, Maxam-Gilbert sequencing, shotgun sequencing, bridge PCR, mass spectrometry-based sequencing, microfluidic Sanger sequencing, microscopy-based sequencing, RNAP sequencing, or hybridization-based sequencing, and then compared with the sequences from the nucleic acid library (104) and/or sequences from a prior passage. Sanger sequencing can utilize a chain-termination method which relies on selective incorporation of chain-terminating dideoxynucleotides by DNA polymerases during replication. Maxam-Gilbert sequencing can utilize chemical modification of DNA and subsequent cleavage at specific bases. In a shotgun sequencing method, DNA can be randomly fragmented and then sequenced using chain termination methods to obtain reads. Multiple overlapping reads can be pooled and assembled into a continuous sequence. In a bridge PCR method, DNA can be fragmented and then amplified by solid surface tethered primers to form "DNA colonies" or "DNA clusters". Multiple overlapping "DNA colonies" or "DNA clusters" can be pooled and assembled into a continuous sequence. In mass spectrometry-based sequencing, DNA fragments can be generated by chain-termination sequencing methods, and the fragments can be read by mass spectrometry methods such as matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). In a microfluidic Sanger sequencing method, amplification of the DNA fragments and their separation can be achieved on a single glass wafer. In a microscopy-based method, electron microscopy such as transmission electron microscopy DNA sequencing can be used to visualize DNA bases labeled with heavy atoms. An RNAP sequencing method can utilize the distinct motions that RNA polymerase generates during transcription of each nucleotide base and can generate a sequence based on this motion. A hybridization-based sequencing method can utilize a DNA microarray in which a single pool of DNA of interest is fluorescently labeled and hybridized to an array containing known sequences. Strong hybridization signals from a particular spot on the array can allow identification of the sequence of the DNA of interest.

[0237] Amplification methodologies can be used to amplify the nucleic acid sequences. Exemplary amplification methodologies include polymerase chain reaction (PCR), nucleic acid sequence based amplification (NASBA), self-sustained sequence replication (3SR), isothermal amplification, isothermal linear amplification, loop mediated isothermal amplification (LAMP), strand displacement amplification (SDA), whole genome amplification, multiple displacement amplification, helicase dependent amplification, nicking enzyme amplification reaction, recombinase polymerase amplification, reverse transcription PCR, ligation mediated PCR, digital PCR (dPCR), droplet digital PCR (ddPCR), or methylation specific PCR.
[0238] Additional methods that can be used to obtain a nucleic acid sequence include, e.g., array-based comparative genomic hybridization, detecting single nucleotide polymorphisms (SNPs) with arrays, subtelomeric fluorescence in situ hybridization (ST-FISH) (e.g., to detect submicroscopic copy-number variants (CNVs)), DNA microarray, high-density oligonucleotide microarray, whole-genome RNA expression array, peptide microarray, enzyme-linked immunosorbent assay (ELISA), genome sequencing, de novo sequencing, polony sequencing, copy number variation (CNV) analysis sequencing, single nucleotide polymorphism (SNP) analysis, immunohistochemistry (IHC), immunocytochemistry (ICC), mass spectrometry, tandem mass spectrometry, matrix-assisted laser desorption ionization time of flight mass spectrometry (MALDI-TOF MS), in situ hybridization, fluorescent in situ hybridization (FISH), chromogenic in situ hybridization (CISH), silver in situ hybridization (SISH), digital PCR (dPCR), reverse transcription PCR, quantitative PCR (Q-PCR), single marker qPCR, real-time PCR, nCounter Analysis (Nanostring technology), Western blotting, Southern blotting, SDS-PAGE, gel electrophoresis, and Northern blotting.

[0239] 3c. Identification and evaluation of invariantpeptide regions [0240] Sequences obtained from 103 can be compared with the nucleic acid library 101 and/or sequences from a prior passage to identify invariant amino acid residues of a given pathogen. Analysis of the data can be carried out using sequence analysis programs such as BFAST or Burrows-Wheeler Aligner (BWA) for each set of sequences from the nucleic acid library or generated from the viruses. A criteria associated with BWA can be a maximum of six mismatches and/or no gaps. True mutations, or mutations that contain at least 60%, 70%, 80%, 90%, 95% or more in occurrence frequency within a clonal set as defined by identical barcodes on the adaptor and same fragment sequence may be highlighted and false mutations that contain at most 30% , 20%>, 10%>, 5%, or less in occurrence frequency within a clonal set as defined by identical barcodes on the adaptor and same fragment sequence can be filtered from the data. Invariance, e.g., relative fitness (RF) index, can be calculated for individual point mutation based on the equation: (occurrence frequency in passaged virus) / (occurrence frequency in nucleic acid library or in a previous passage). [0241] An invariance value can be assigned based on the equation to each nucleic acid position and can also be correlated to each amino acid residue within a pathogen. For example, an occurrence frequency of a nucleic acid sequence in the nucleic acid library 101 can be calculated based on criteria such as the 5' sequence of the nucleic acid, the 3' sequence of the nucleic acid, a label or a barcode attached to the nucleic acid if the nucleic acid is sequenced using a sequencing technology such as a SOLiD sequencing method, and/or the orientation of the nucleic acid with respect to an adaptor if using a sequencing technology such as a SOLiD sequencing method. 
Therefore for example, if 9 or 10 out of 10 sequence readings show a modification or variant at a given position, this occurrence frequency can be a true mutation. If 1 or 2 out of 10 sequence readings show a modification or variant at a given position, this occurrence frequency can be a sequencing error and can be filtered from the data. Additional sequencing errors can be filtered during such as the amplification step, and/or the polymerization recovery step. [0242] A non-random fragmentation can occur in the nucleic acid library 101 . As such, a binomial exact test (null hypothesis =1%), can be used to assign a p-value for each modification or variant in each sequence reading, and a cutoff can be set for a true mutation at for example p < 10 4 . For example, if 9 out of 28 sequence readings show a modification or variant at a given position, this occurrence frequency can be a true mutation if the p < 10 4 threshold can be satisfied. If 1 or 2 out of 28 sequence readings show a modification or variant at a given position, this occurrence frequency can be a sequencing error if the p < 10 4 threshold cannot be satisfied. Occurrence frequency can be normalized and invariance can be calculated for each point mutation using (normalized occurrence frequency in passaged virus) / (normalized occurrence frequency in nucleic acid library or in a previous passage). [0243] A value of (normalized occurrence frequency in passaged virus) / (normalized occurrence frequency in nucleic acid library or in a previous passage) can be assigned to a position in a nucleic acid sequence. A value of between about 0 and about 5 can be assigned to a position in a nucleic acid sequence. A value of between about 0 and about 0.05 can be assigned as "strongly attenuated" to indicate the propagation property of the mutation at a nucleic acid position. A "strongly attenuated" mutation can include a lethal mutation or a mutation that impairs the propagation of a pathogen. 
A "strongly attenuated" mutation can impair the propagation of a pathogen by about 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% relative to the propagation of a native pathogen. Sometimes, a "strongly attenuated" mutation can impair the propagation of a pathogen by about 95% to 100% relative to the propagation of a native pathogen. A value of between about 0.05 and 0.33 can be assigned as "moderately attenuated" at a nucleic acid position. A "moderately attenuated" mutation can include a mutation that impairs the propagation of a pathogen. A "moderately attenuated" mutation can impair the propagation of a pathogen by about 20%>, 30%>, 40%>, 45%, 50%>, 55%, 60%, 65%, 70%, 75%, 80%, 90%, or 95% relative to the propagation of a native pathogen. Sometimes, a "moderately attenuated" mutation can impair the propagation of a pathogen by about 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70% to about 95% relative to the propagation of a native pathogen. A value of between about 0.33 and 3 can be assigned as "neutral" at a nucleic acid position. A "neutral" mutation can include a mutation that impairs the propagation of a pathogen or a mutation that does not impairs the propagation of a pathogen. A "neutral" mutation can impair the propagation of a pathogen by about 0%, 5%, 10%, 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, or 80% relative to the propagation of a native pathogen. Sometimes, a "neutral" mutation can impair the propagation of a pathogen by about 0% to about 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, or 35% relative to the propagation of a native pathogen. A value of above 3 can be assigned as "enhanced" at a nucleic acid position. An "enhanced" mutation can include a mutation that increases the propagation of a pathogen. An "enhanced" mutation can increase the propagation of a pathogen by about 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 250%, 300%, 350%, 400%, 450%, 500% or more relative to the propagation of a native pathogen. 
Sometimes, an "enhanced" mutation can increase the propagation of a pathogen by about 300% or more relative to the propagation of a native pathogen.
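
The filtering and scoring described in paragraphs [0242] and [0243] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the test is taken to be a one-sided exact binomial tail with a 1% per-read error rate as the null, and the handling of values that fall exactly on a category boundary is a choice made here, since the text gives only "about" ranges. The function names are hypothetical.

```python
from math import comb


def binomial_tail_p(variant_reads, total_reads, null_rate=0.01):
    """One-sided exact binomial p-value: P(X >= variant_reads) under the null.

    Assumption: the 1% null hypothesis is read as a per-read error rate.
    """
    return sum(
        comb(total_reads, i) * null_rate**i * (1 - null_rate) ** (total_reads - i)
        for i in range(variant_reads, total_reads + 1)
    )


def is_true_mutation(variant_reads, total_reads, cutoff=1e-4):
    """Call a variant a true mutation when its p-value is below the cutoff."""
    return binomial_tail_p(variant_reads, total_reads) < cutoff


def classify_ratio(ratio):
    """Map (frequency in passaged virus) / (frequency in library) to a category.

    Boundary handling (which side 0.05, 0.33 and 3 fall on) is an assumption.
    """
    if ratio < 0.05:
        return "strongly attenuated"
    if ratio < 0.33:
        return "moderately attenuated"
    if ratio <= 3:
        return "neutral"
    return "enhanced"


if __name__ == "__main__":
    for reads in (1, 2, 9):
        print(reads, "of 28 reads:", is_true_mutation(reads, 28))
    print(classify_ratio(0.2))
```

With these assumptions, 9 variant reads out of 28 pass the 10⁻⁴ cutoff, while 1 or 2 out of 28 do not, which matches the examples given in the text.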

[0244] 3. Mutations [0245] In some instances, a mutation is a substitution, an insertion, a deletion, or a frameshift. A mutation can be a silent mutation. Sometimes, a mutation can generate a cis-regulatory element (CRE), or a trans-regulatory element. A CRE is a region of non-coding DNA which can regulate or modify the transcription of nearby genes. A trans-regulatory element is a gene that can modify or regulate the transcription of distant genes. A "strongly attenuated" mutation can include a substitution, an insertion, a deletion, a frameshift, a silent mutation, a cis-regulatory element, or a trans-regulatory element. A "moderately attenuated" mutation can include a substitution, an insertion, a deletion, a frameshift, a silent mutation, a cis-regulatory element, or a trans-regulatory element. A "neutral" mutation can include a substitution, an insertion, a deletion, a frameshift, a silent mutation, a cis-regulatory element, or a trans-regulatory element. An "enhanced" mutation can include a substitution, an insertion, a deletion, a frameshift, a silent mutation, a cis-regulatory element, or a trans-regulatory element. A lethal mutation can include a substitution, an insertion, a deletion, a frameshift, a silent mutation, a cis-regulatory element, or a trans-regulatory element.

[0246] 3c2. HLA affinity binding analysis [0247] Regions containing one or more of the invariant values can be selected for sequence consensus analysis and/or HLA affinity binding analysis. Any sequence comparison software can be used for sequence consensus analysis. Sequence entropy calculations can be used to analyze and determine regions of sequence conservation. One example of a sequence entropy analysis method can be found in Strait and Dewey, "The Shannon information entropy of protein sequences" Biophysical Journal 71:148-155 (1996). [0248] HLA affinity binding analysis can be carried out using analysis programs such as NetMHCpan 2.8 from the Center for Biological Sequence Analysis (CBS) at the Technical University of Denmark, the HLA Peptide Binding Predictions server from the National Institutes of Health, the MHC-I binding predictions server from the Immune Epitope Database (IEDB), and the like. [0249] A candidate peptide can have an HLA binding affinity of between about 1 pM and about 1 mM, about 100 pM and about 500 µM, about 500 pM and about 10 µM, about 1 nM and about 1 µM, or about 10 nM and about 1 µM. A candidate peptide can have an HLA binding affinity of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 700, 800, 900 µM, or more. A candidate peptide can have an HLA binding affinity of at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 700, 800, 900 µM, or less.
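
The sequence entropy calculation mentioned in paragraph [0247] can be illustrated with a per-position Shannon entropy over a set of aligned sequences. The sketch below is an assumption about how such a calculation might look, not the method of the cited reference: it treats every sequence as already aligned and of equal length, reports entropy in bits, and uses hypothetical function names. A position with entropy 0 is fully conserved, and higher values indicate more variation.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the residues observed at one alignment position."""
    counts = Counter(column)
    total = len(column)
    # p * log2(1 / p), written with counts to avoid negative zero
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for equal-length, pre-aligned sequences (assumed input)."""
    return [column_entropy(column) for column in zip(*sequences)]


if __name__ == "__main__":
    aligned = ["ACA", "ACG", "ACT", "ACC"]  # toy sequences, not real data
    print(alignment_entropy(aligned))
```

Positions with low entropy across many strains would be the candidates for the conserved regions discussed above.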

[0250] 3c3. Additional analysis [0251] Additional analysis can be carried out to select candidate peptides for vaccine development 105 and for administration to a patient for treatment 106. These additional analyses or screenings can involve analysis of the immune response based on immunological assays. In some cases, the test animals are first immunized (prime), with or without a second immunization (boost) weeks after the prime, and blood or tissue samples are collected, for example, two to four weeks after the last immunization. These studies can allow measurement of immune parameters that correlate to protective immunity, such as induction of specific antibodies (e.g., IgA, IgD, IgE, IgG, or IgM) and induction of specific T lymphocyte responses, in addition to determining whether an antigen or pool of antigens provides protective immunity. [0252] Spleen cells, lung cells, cells from mediastinal lymph nodes, or peripheral blood mononuclear cells can be isolated from immunized test animals and measured for the presence of antigen-specific T cells and induction of cytokine synthesis. ELISA, ELISPOT, or cytoplasmic cytokine staining, alone or combined with flow cytometry, can provide such information on a single-cell level. [0253] Immunological tests that can be used to identify the efficacy of immunization include antibody measurements, neutralization assays, and analysis of activation levels or frequencies of antigen presenting cells or lymphocytes that are specific for the antigen or pathogen. The test animals that can be used in such studies include mice, rats, guinea pigs, hamsters, rabbits, cats, dogs, pigs, monkeys, or humans. [0254] Monkeys can be useful test animals, e.g., due to the similarities of the MHC molecules between monkeys and humans. Virus neutralization assays can be useful for detection of antibodies that not only specifically bind to a pathogen, but also neutralize the function of the pathogen (e.g., virus). 
These assays are typically based on detection of antibodies in the sera of immunized animals and analysis of these antibodies for their capacity to inhibit pathogen (e.g., virus) growth in tissue culture cells. Such assays are known to those skilled in the art. One example of a virus neutralization assay is described by Dolin R (J. Infect. Dis. 1995, 172:1175-83). Virus neutralization assays can be used to screen for antigens that also provide protective immunity.

[0255] 3c4. Exemplary properties of candidate peptides [0256] A candidate peptide selected for vaccine development can be at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 100 or more amino acid residues in length. A candidate peptide selected for vaccine development can be at most 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 100 or fewer amino acid residues in length. [0257] A candidate peptide selected for vaccine development can have a pI value of between about 0.5 and about 12, about 2 and about 10, or about 4 and about 8. A candidate peptide selected for vaccine development can have a pI value of at least 4.5, 5, 5.5, 6, 6.5, 7, 7.5, or more. A candidate peptide that has an HLA binding affinity can have a pI value of at most 4.5, 5, 5.5, 6, 6.5, 7, 7.5, or less. [0258] A composition can comprise a cocktail of any combination of the peptides described herein. For example, a cocktail can comprise peptides based on one type of protein (e.g., PB) or one strain of virus or one type of virus (e.g., influenza A virus).

[0259] 3c5. Trials [0260] In some instances, a composition, e.g., a vaccine as provided herein, undergoes a preclinical trial prior to entrance into one or more clinical trials. The preclinical trial can last from about 4 to about 36 months, from about 6 to about 28 months, from about 12 to about 26 months, or from about 18 to about 24 months. Sometimes, the preclinical trial lasts from about 18 to about 24 months. The preclinical trials can incorporate events such as safety and toxicity studies, animal model studies, pre-IND meetings, and IND approvals. An animal model can be a mouse, rat, ferret, dog, monkey, or rabbit model. An animal model can be a mouse or ferret model. [0261] The clinical trial can include a phase I clinical trial and/or phase II clinical trial. In a phase I clinical trial, safety and immunogenicity can be evaluated in patients. Sometimes, the patients are healthy volunteers. In some instances, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more patients are evaluated during a Phase I clinical trial. In some cases, safety, dosage, and immune response are established and evaluated. In some instances, the Phase I clinical trial lasts from about 4 to about 36 months, from about 6 to about 28 months, from about 10 to about 26 months, or from about 12 to about 24 months. Sometimes, the Phase I clinical trial lasts about 12 months. [0262] The Phase II clinical trial can evaluate efficacy in patients. Sometimes the patients are healthy volunteers. Other times, the patients are already infected with an influenza A virus. In some cases, the patients are first immunized with a composition, e.g., a vaccine as provided herein, and then challenged with an influenza A virus. In some instances, data associated with whether the composition, e.g., vaccine, conveys protection are collected and analyzed. Sometimes, 4, 6, 8, 10, 12, 14, 16, 18, 20, 24, 26, 28, 30, or more patients are evaluated during the Phase II clinical trial. In some instances, the Phase II clinical trial lasts from about 4 to about 36 months, from about 6 to about 28 months, from about 10 to about 26 months, or from about 12 to about 24 months. Sometimes, the Phase II clinical trial lasts about 12 months. In some instances, the Phase II clinical trials can further be divided into Phase IIA and Phase IIB clinical trials.

[0263] 4. VACCINE COMPOSITIONS [0264] A vaccine can be any substance used to stimulate the production of antibodies and provide immunity against one or more diseases. Vaccines may be prepared from live attenuated pathogens, or from pathogens that have been inactivated by, e.g., chemicals, heat, or radiation. Vaccines can contain subunits or portions of a pathogen, in which these subunits can be optionally conjugated. Vaccines can also be prepared as a peptide-based vaccine, a nucleic acid-based vaccine, an antibody based vaccine, or an antigen-presenting cell based vaccine. [0265] A vaccine can be a traditional vaccine or a universal vaccine. A traditional vaccine can be a vaccine that can target a specific pathogen. Measles vaccine is one example of a traditional vaccine. It can target epitopes present on the hemagglutinin (H) protein of the Measles virus that have remained conserved over 50 years. [0266] Seasonal vaccines can be another type of traditional vaccine. For example, an influenza vaccine is modified annually and is tailored to the population of influenza viruses present in a given year. In some cases, an influenza vaccine is generated as a trivalent vaccine, which can include two subtypes of the influenza A virus, H1N1 and H3N2, and one strain of the influenza B virus. Sometimes, the influenza vaccine is generated as a quadrivalent vaccine, which can include two subtypes of influenza A virus and two strains of influenza B virus. The specific strains of the influenza A and B viruses can be chosen based on surveillance-based forecasts that can predict the pathogenicity of the circulating strains each year and can vary from country to country. [0267] A universal vaccine can be a vaccine that offers broad-based protection against multiple strains of a pathogen, and/or against multiple pathogens within the same family. 
Exemplary universal vaccines include SynCon® influenza vaccines from Inovio Pharmaceuticals, M-001 from BiondVax, and FP-01 from Immune Targeting Systems. These universal vaccines can target conserved regions or epitopes that exist within the influenza viral proteins. Conserved regions or epitopes can exhibit at least 70%, 80%, 90%, 95%, or 99% sequence homology or sequence identity. [0268] Vaccine compositions can be formulated using one or more physiologically acceptable carriers including excipients and auxiliaries which facilitate processing of one or more active agents, such as one or more peptides, nucleic acids, proteins (e.g., antibodies or fragments thereof), APCs, or viruses described herein, into preparations which can be used pharmaceutically. Proper formulation can be dependent upon the route of administration chosen.

[0269] In some cases, the vaccine composition is formulated as a peptide-based vaccine, a nucleic acid-based vaccine, an antibody based vaccine, a cell based vaccine, or a virus-based vaccine. For example, a vaccine composition can include naked cDNA in cationic lipid formulations; lipopeptides (e.g., Vitiello, A. et al., J. Clin. Invest. 95:341, 1995); naked cDNA or peptides encapsulated, e.g., in poly(DL-lactide-co-glycolide) ("PLG") microspheres (see, e.g., Eldridge et al., Molec. Immunol. 28:287-294, 1991; Alonso et al., Vaccine 12:299-306, 1994; Jones et al., Vaccine 13:675-681, 1995); peptide compositions contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature 344:873-875, 1990; Hu et al., Clin Exp Immunol. 113:235-243, 1998); or multiple antigen peptide systems (MAPs) (see, e.g., Tam, J. P., Proc. Natl Acad. Sci. U.S.A. 85:5409-5413, 1988; Tam, J. P., J. Immunol. Methods 196:17-32, 1996). Sometimes, a vaccine is formulated as a peptide-based vaccine, or a nucleic acid based vaccine in which the nucleic acid encodes the peptides. Sometimes, a vaccine is formulated as an antibody based vaccine. Sometimes, a vaccine is formulated as a cell based vaccine.

[0270] 4a. Peptide-Based Vaccine [0271] A peptide-based vaccine can be formulated using suitable techniques, carriers, and excipients. In some embodiments, the peptide-based vaccines are formulated to improve their biological half-life, stability, efficacy, bioavailability, bioactivity, or a combination thereof. [0272] Sometimes, a vaccine can comprise a cocktail of multiple peptides described herein containing the same sequence, or a cocktail of multiple copies of different peptides described herein. The peptides can be modified, such as by lipidation, or attachment to a carrier protein. Lipidation can be the covalent attachment of a lipid group to a peptide. Lipidated peptides can stabilize structures and can enhance efficacy of the vaccine treatment. [0273] Lipidation can be classified into several different types, such as N-myristoylation, palmitoylation, GPI-anchor addition, prenylation, and several additional types of modifications. N-myristoylation can be the covalent attachment of myristate, a C14 saturated acid, to a glycine residue. Palmitoylation can be thioester linkage of long-chain fatty acids (C16) to cysteine residues. GPI-anchor addition can be glycosyl-phosphatidylinositol (GPI) linkage via amide bond. Prenylation can be the thioether linkage of an isoprenoid lipid (e.g., farnesyl (C-15), geranylgeranyl (C-20)) to cysteine residues. Additional types of modifications can include attachment of S-diacylglycerol by a sulfur atom of cysteines, O-octanoyl conjugation via serine or threonine residues, S-archaeol conjugation to cysteine residues, and cholesterol attachment. [0274] Fatty acids for generating a lipidated peptide can include C2 to C30 saturated, monounsaturated, or polyunsaturated fatty acyl groups. Exemplary fatty acids can include palmitoyl, myristoyl, stearoyl, and decanoyl groups. 
[0275] In some embodiments, a lipid moiety that has an adjuvant property is attached to a peptide of interest to elicit or enhance immunogenicity in the absence of an extrinsic adjuvant. A lipidated peptide or lipopeptide can be referred to as a self-adjuvant lipopeptide. [0276] Any of the fatty acids described above and elsewhere herein can elicit or enhance immunogenicity of a peptide of interest. A fatty acid that can elicit or enhance immunogenicity can include palmitoyl, myristoyl, stearoyl, lauroyl, octanoyl, and decanoyl groups. In some cases, a fatty acid that can elicit or enhance immunogenicity can include palmitoyl groups. Non-limiting examples of palmitoyl groups include Pam2Cys, Pam3Cys, or Pam3OH.

[0277] Pam2Cys, also known as dipalmitoyl-S-glyceryl-cysteine or S-[2,3-bis(palmitoyloxy)propyl]cysteine, corresponds to the lipid moiety of MALP-2, a macrophage-activating lipopeptide isolated from Mycoplasma fermentans.

[0278] Pam3Cys, also known as Pam3OH or N-palmitoyl-S-[2,3-bis(palmitoyloxy)propyl]cysteine, is a synthetic version of the N-terminal moiety of Braun's lipoprotein that spans the inner and outer membranes of Gram negative bacteria.

[0279] Other fatty acid groups contemplated for use include Ste2Cys (also known as S-(2,3-bis(stearoyloxy)propyl)cysteine or distearoyl-S-glyceryl-cysteine), Lau2Cys (also known as S-[2,3-bis(lauroyloxy)propyl]cysteine or dilauroyl-S-glyceryl-cysteine), and Oct2Cys (also known as S-[2,3-bis(octanoyloxy)propyl]cysteine or dioctanoyl-S-glyceryl-cysteine). [0280] Additional suitable fatty acid groups include synthetic triacylated and diacylated lipopeptides, FSL-1 (a synthetic lipoprotein derived from Mycoplasma salivarium I), Pam3Cys (tripalmitoyl-S-glyceryl cysteine) and S-[2,3-bis(palmitoyloxy)-(2RS)-propyl]-N-palmitoyl-(R)-cysteine, where "Pam3" is "tripalmitoyl-S-glyceryl". Derivatives of Pam3Cys are also suitable for use, in which derivatives include S-[2,3-bis(palmitoyloxy)-(2-R,S)-propyl]-N-palmitoyl-(R)-Cys-(S)-Ser-(Lys)4-hydroxytrihydrochloride; Pam3Cys-Ser-Ser-Asn-Ala; Pam3Cys-Ser-(Lys)4; Pam3Cys-Ala-Gly; Pam3Cys-Ser-Gly; Pam3Cys-Ser; Pam3Cys-OMe; Pam3Cys-OH; PamCAG, palmitoyl-Cys((RS)-2,3-di(palmitoyloxy)-propyl)-Ala-Gly-OH; and the like. Another non-limiting example is Pam2CSK4 (dipalmitoyl-S-glyceryl cysteine-serine-(lysine)4; or Pam2Cys-Ser-(Lys)4), which is a synthetic diacylated lipopeptide. Further suitable fatty acid groups include those described, e.g., in Kellner et al. (1992, Biol. Chem. 373:1:51-5); Seifer et al. (1990, Biochem. J 26:795-802) and Lee et al. (2003, J. Lipid Res., 44:479-486). [0281] Peptides such as naked peptides or lipidated peptides can be incorporated into a liposome. For example, the lipid portion of the lipidated peptide can spontaneously integrate into the lipid bilayer of a liposome. Thus, a lipopeptide can be presented on the "surface" of a liposome. A lipidated peptide can be a peptide that is encapsulated within a liposome. [0282] Exemplary liposomes suitable for incorporation in the formulations include, and are not limited to, multilamellar vesicles (MLV), oligolamellar vesicles (OLV), unilamellar vesicles (UV), small unilamellar vesicles (SUV), medium-sized unilamellar vesicles (MUV), large unilamellar vesicles (LUV), giant unilamellar vesicles (GUV), multivesicular vesicles (MVV), single or oligolamellar vesicles made by reverse-phase evaporation method (REV), multilamellar vesicles made by the reverse-phase evaporation method (MLV-REV), stable plurilamellar vesicles (SPLV), frozen and thawed MLV (FATMLV), vesicles prepared by extrusion methods (VET), vesicles prepared by French press (FPV), vesicles prepared by fusion (FUV), dehydration-rehydration vesicles (DRV), and bubblesomes (BSV). Techniques for preparing liposomes are described in, for example, COLLOIDAL DRUG DELIVERY SYSTEMS, vol. 66 (J. Kreuter ed., Marcel Dekker, Inc. (1994)). [0283] Depending on the method of preparation, liposomes can be unilamellar or multilamellar, and can vary in size with diameters ranging from about 0.02 µm to greater than about 10 µm. Sometimes, the liposomes can be small unilamellar vesicles (25-50 nm), large unilamellar vesicles (100-200 nm), giant unilamellar vesicles (1-2 µm), and multilamellar vesicles (MLV; 1 µm-2 µm). 
The peptides being delivered can be either encapsulated into liposomes or adsorbed on the surface. The size and surface properties of liposomes may be optimized for a desired result. For example, unilamellar and multilamellar liposomes provide sustained release from several hours to days after intravascular administration. The prolonged drug release can be achieved by multivesicular liposomes, also known as DepoFoam® technology. Unlike ULV and MLV, multivesicular liposomes are composed of nonconcentric multiple aqueous chambers surrounded by a network of lipid layers which confers an increased level of stability and longer duration of drug release. The liposomes may be further modified to achieve a desired result. For example, the liposomes may be PEGylated or have other surface modifications in order to interfere with recognition and uptake by the reticuloendothelial system and provide increased circulation times. [0284] Liposomes can adsorb to many types of cells and then release an incorporated agent (e.g., a peptide described herein). In some cases, the liposomes fuse with the target cell, whereby the contents of the liposome then empty into the target cell. A liposome can be endocytosed by cells that are phagocytic. Endocytosis can be followed by intralysosomal degradation of liposomal lipids and release of the encapsulated agents. Scherphof et al., Ann. N.Y. Acad. Sci., 446: 368 (1985). [0285] The liposomes provided herein can also comprise carrier lipids. In some embodiments the carrier lipids are phospholipids. Carrier lipids capable of forming liposomes include, but are not limited to, dipalmitoylphosphatidylcholine (DPPC), phosphatidylcholine (PC; lecithin), phosphatidic acid (PA), phosphatidylglycerol (PG), phosphatidylethanolamine (PE), and phosphatidylserine (PS). 
Other suitable phospholipids further include distearoylphosphatidylcholine (DSPC), dimyristoylphosphatidylcholine (DMPC), dipalmitoylphosphatidylglycerol (DPPG), distearoylphosphatidylglycerol (DSPG), dimyristoylphosphatidylglycerol (DMPG), dipalmitoylphosphatidic acid (DPPA), dimyristoylphosphatidic acid (DMPA), distearoylphosphatidic acid (DSPA), dipalmitoylphosphatidylserine (DPPS), dimyristoylphosphatidylserine (DMPS), distearoylphosphatidylserine (DSPS), dipalmitoylphosphatidylethanolamine (DPPE), dimyristoylphosphatidylethanolamine (DMPE), distearoylphosphatidylethanolamine (DSPE) and the like, or combinations thereof. In some embodiments, the liposomes further comprise a sterol (e.g., cholesterol) which modulates liposome formation. The carrier lipids can be any known non-phosphate polar lipids. [0286] A peptide can also be attached to a carrier protein for delivery as a vaccine. The carrier protein can be an immunogenic carrier element and can be attached by any recombinant technology. Exemplary carrier proteins include Mariculture keyhole limpet hemocyanin (mcKLH), PEGylated mcKLH, Blue Carrier* Proteins, bovine serum albumin (BSA), cationized BSA, ovalbumin, and bacterial proteins such as tetanus toxoid (TT). [0287] A peptide can also be prepared as multiple antigenic peptides (MAPs). Peptides can be attached at the N-terminus or the C-terminus to small non-immunogenic cores. Peptides built upon this core can offer highly localized peptide density. The core can be a dendritic core residue or matrix composed of bifunctional units. Suitable core molecules for constructing MAPs can include ammonia, ethylenediamine, aspartic acid, glutamic acid, and lysine. For example, a lysine core molecule can be attached via peptide bonds through each of its amino groups to two additional lysines. 
This second generation molecule has four free amino groups, each of which can be covalently linked to an additional lysine to form a third generation molecule with 8 free amino groups. A peptide may be attached via its C-terminus to each of these free groups to form an octavalent multiple antigenic peptide (also referred to as a "MAP8" structure). The second generation molecule having four free amino groups can be used to form a tetravalent or tetrameric MAP, e.g., a MAP having four peptides covalently linked to the core (also referred to as a "MAP4" structure). The carboxyl group of the first lysine residue can be left free, amidated, or coupled to β-alanine or another blocking compound. As used herein, the "linear portion or molecule" of a MAP system structure can refer to antigenic peptides that are linked to the core matrix. Thus, a cluster of antigenic epitopes can form the surface of a MAP and a small matrix forms its core. The dendritic core and the entire MAP can be synthesized on a solid resin using a classic Merrifield synthesis procedure. MAP synthesis is generally described, for example, in U.S. Pat. Nos. 5,580,563 and 6,379,679, and Tam, Proc. Natl. Acad. Sci. USA 85:5409-5413, 1988. [0288] The peptides used for MAP preparation can be identical or can comprise multiple different sequences and lengths. The peptides can be derived from a bacterium, a virus, or a fungus. The peptides can be derived from a virus, such as influenza A virus, influenza B virus, influenza C virus, hepatitis B virus, hepatitis C virus, or HIV. [0289] Sometimes, a peptide can be subjected to cyclization to result in a cyclic peptide which is resistant to proteolytic degradation. Cyclization can be carried out between side chains or ends of the peptide sequences through disulfide bonds, lanthionine, dicarba, hydrazine, or lactam bridges using methods known in the art.
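
The branching arithmetic of the lysine core described in paragraph [0287] can be made explicit with a short sketch. It assumes an idealized, fully branched core in which every free amino group of one generation receives one lysine in the next, so each generation doubles the number of attachment sites. The function name is hypothetical.

```python
def lysine_core(generations):
    """Return (lysines in core, free amino groups) for a fully branched lysine core.

    Assumption: idealized branching, where each lysine contributes two amino
    groups and every free amino group is extended in the next generation.
    """
    if generations < 1:
        raise ValueError("a core needs at least one lysine generation")
    free_amino_groups = 2**generations
    lysines = 2**generations - 1
    return lysines, free_amino_groups


if __name__ == "__main__":
    for g in (1, 2, 3):
        lysines, sites = lysine_core(g)
        print(f"generation {g}: {lysines} lysines, {sites} peptide attachment sites")
```

Under this model the second generation gives the four sites of a "MAP4" structure and the third generation gives the eight sites of a "MAP8" structure, as stated in the text.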

[0290] In some embodiments, the peptides are conjugated to a molecule such as vitamin B12, a lipid, or an ethylene oxide compound, e.g., polyethylene glycol (PEG), polyethylene oxide (PEO), and polyoxyethylene (POE), methoxypolyethylene glycol (MPEG), mono-methoxy PEG (mPEG), and the like. The ethylene oxide compound can be further functionalized with, for example, amine binding terminal functional groups such as N-hydroxysuccinimide esters, N-hydroxysuccinimide carbonates, and aliphatic aldehyde, or thiol binding groups such as maleimide, pyridyl disulphides, and vinyl sulfonates. Since amino groups (α-amino and ε-lysine amino) and cysteine residues are well suited for conjugation, the peptides according to the present invention can further include one or more amino acid residues for conjugation to an ethylene oxide molecule or a carrier compound known in the art. The pharmacokinetic and pharmacodynamic properties of a conjugated peptide can be further modified by the use of a particular linker. For example, propyl and amyl linkers can be used to provide a conjugate having a loose conformation whereas a phenyl linker may be used to provide a denser conformation as well as shield domains adjacent to the C-terminus. In some instances, dense conformations can be more efficient in maintaining bioactivity, prolonging plasma half-life, lowering proteolytic sensitivity, and immunogenicity relative to loose conformations. [0291] In some embodiments, the peptides are hyperglycosylated using methods known in the art, e.g., in situ chemical reactions or site-directed mutagenesis. Hyperglycosylation can result in either N-linked or O-linked protein glycosylation. The clearance rate of a given peptide can be optimized by the selection of the particular saccharide. For example, polysialic acid (PSA) is available in different sizes and its clearance depends on type and molecular size of the polymer. 
Thus, for example, PSAs having high molecular weights can be suitable for the delivery of low-molecular-weight peptides, and PSAs having low molecular weights can be suitable for the delivery of peptides having high molecular weights. The type of saccharide can be used to target the peptide to a particular tissue or cell. For example, peptides conjugated with mannose can be recognized by mannose-specific lectins, e.g., mannose receptors and mannose binding proteins, and are taken up by the liver. In some embodiments, the peptides are hyperglycosylated to improve their physical and chemical stability under different environmental conditions, e.g., to inhibit inactivation under stress conditions and reduce aggregation resulting from production and storage conditions. [0292] In some embodiments, a drug delivery system, such as microparticles, nanoparticles (particles having sizes ranging from 10 to 1000 nm), nanoemulsions, liposomes, and the like, is used to provide protection of sensitive proteins, prolong release, reduce administration frequency, increase patient compliance, and control plasma levels. Various natural or synthetic microparticles and nanoparticles, which can be biodegradable and/or biocompatible polymers, can be used. Microparticles and nanoparticles can be fabricated from lipids, polymers, and/or metal. Polymeric microparticles and nanoparticles can be fabricated from natural or synthetic polymers, such as starch, alginate, collagen, chitosan, polycaprolactones (PCL), polylactic acid (PLA), poly (lactide-co-glycolide) (PLGA), and the like. In some embodiments, the nanoparticles are solid lipid nanoparticles (SLNs), carbon nanotubes, nanospheres, nanocapsules, and the like. In some embodiments, the polymers are hydrophilic. In some embodiments, the polymers are thiolated polymers. 
[0293] Since the rate and extent of drug release from microparticles and nanoparticles can depend on the composition of the polymer and the fabrication method, one can select a given composition and fabrication method, e.g., spray drying, lyophilization, microextrusion, and double emulsion, to confer a desired drug release profile. Since peptides incorporated in or on microparticles or nanoparticles can be prone to denaturation at the aqueous-organic interface during formulation development, different stabilizing excipients and compositions can be used to prevent aggregation and denaturation. For example, PEG and sugars, e.g., PEG (MW 5000) and maltose with α-chymotrypsin, can be added to the composition to reduce aggregation and denaturation. Additionally, chemically modified peptides, e.g., conjugated peptides and hyperglycosylated peptides, can be employed. [0294] Protein stability can also be achieved by the selected fabrication method. For example, to prevent degradation at the aqueous-organic interface, a non-aqueous methodology called ProLease® technology may be used. Peptides in solid state can also be encapsulated using solid-in-oil-in-water (s/o/w) methods, e.g., spray- or spray-freeze-dried peptides or peptide-loaded solid nanoparticles can be encapsulated in microspheres using s/o/w methods. Hydrophobic ion-pairing (HIP) complexation may be used to enhance protein stability and increase encapsulation efficiency into microparticles and nanoparticles. In hydrophobic ion-pairing (HIP) complexation, ionizable functional groups of a peptide are complexed with ion-pairing agents (e.g., surfactant or polymer) containing oppositely charged functional groups, leading to formation of a HIP complex where hydrophilic protein molecules exist in a hydrophobic complex form. [0295] A peptide described herein can be chemically synthesized, or recombinantly expressed in a cell system or a cell-free system. 
A peptide can be synthesized, such as by a liquid-phase synthesis, a solid-phase synthesis, or by microwave assisted peptide synthesis. A peptide can be modified, such as by acylation, alkylation, amidation, arginylation, polyglutamylation, polyglycylation, butyrylation, gamma-carboxylation, glycosylation, malonylation, hydroxylation, iodination, nucleotide addition (e.g., ADP-ribosylation), oxidation, phosphorylation, adenylylation, propionylation, S-glutathionylation, S-nitrosylation, succinylation, sulfation, glycation, palmitoylation, myristoylation, isoprenylation or prenylation (e.g., farnesylation or geranylgeranylation), glypiation, lipoylation, attachment of flavin moiety (e.g., FMN or FAD), attachment of heme C, phosphopantetheinylation, retinylidene Schiff base formation, diphthamide formation, ethanolamine phosphoglycerol attachment, hypusine formation, biotinylation, pegylation, ISGylation, SUMOylation, ubiquitination, Neddylation, Pupylation, citrullination, deamidation, eliminylation, carbamylation, or a combination thereof.

[0296] After generation of a peptide, the peptide can be subjected to one or more rounds of purification steps to remove impurities. The purification step can be a chromatographic step utilizing separation methods such as affinity-based, size-exclusion based, ion-exchange based, or the like. In some cases, the peptide is at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities. In some cases, the peptide is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities. In some cases, the amount of the peptides in the peptide composition is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% by weight of the total composition.

[0297] A peptide provided herein can comprise one or more natural amino acids, unnatural amino acids, or a combination thereof. An amino acid residue can be a molecule containing both an amino group and a carboxyl group. Suitable amino acids for use in the peptides described include, without limitation, both the D- and L-isomers (amino acid isomer) of the naturally-occurring amino acids, as well as non-naturally occurring amino acids prepared by organic synthesis or other metabolic routes. An amino acid can be an α-amino acid, β-amino acid, natural amino acid, non-natural amino acid, or amino acid analog. An α-amino acid can be a molecule containing both an amino group and a carboxyl group bound to a carbon which is designated the α-carbon. A β-amino acid can be a molecule containing both an amino group and a carboxyl group in a β configuration. A naturally occurring amino acid can be any one of the twenty amino acids commonly found in peptides synthesized in nature, and known by the one letter abbreviations A, R, N, C, D, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V. A table showing a summary of the properties of natural amino acids can be found, e.g., in U.S. Patent Application Publication No. 20130123169, which is herein incorporated by reference.

[0298] A peptide provided herein can comprise one or more hydrophobic, hydrophilic, polar, or charged amino acids. A hydrophobic amino acid can include small hydrophobic amino acids and large hydrophobic amino acids. A small hydrophobic amino acid can be glycine, alanine, proline, and analogs and isomers thereof. A large hydrophobic amino acid can be a valine, leucine, isoleucine, phenylalanine, methionine, tryptophan, and analogs and isomers thereof. A polar amino acid can be a serine, threonine, asparagine, glutamine, cysteine, tyrosine, and analogs and isomers thereof. A charged amino acid can be a lysine, arginine, histidine, aspartate, glutamate, or analog thereof.
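By way of illustration only, the grouping of natural amino acids in paragraph [0298] can be expressed as a small lookup. The following is a minimal sketch, not part of the disclosed methods; the names `CLASSES`, `classify_residue`, and `composition` are hypothetical, and analogs or non-natural residues are simply reported as unclassified.

```python
from collections import Counter

# Side-chain classes as grouped in paragraph [0298] (one-letter codes).
# The dictionary and function names are illustrative assumptions.
CLASSES = {
    "small hydrophobic": set("GAP"),
    "large hydrophobic": set("VLIFMW"),
    "polar": set("STNQCY"),
    "charged": set("KRHDE"),
}


def classify_residue(aa: str) -> str:
    """Return the [0298] class of a one-letter amino acid code."""
    aa = aa.upper()
    for name, members in CLASSES.items():
        if aa in members:
            return name
    # Analogs and non-natural residues are outside this simple table.
    return "unclassified"


def composition(peptide: str) -> Counter:
    """Count how many residues of a peptide fall into each class."""
    return Counter(classify_residue(aa) for aa in peptide)
```

For example, `composition("AKFVAAWTLKAAA")` tallies the classes of the PADRE sequence recited elsewhere in this description.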
[0299] A peptide as provided herein can comprise one or more amino acid analogs. An amino acid analog can be a molecule which is structurally similar to an amino acid and which can be substituted for an amino acid in the formation of a peptidomimetic macrocycle. Amino acid analogs include β-amino acids and amino acids where the amino or carboxy group is substituted by a similarly reactive group (e.g., substitution of the primary amine with a secondary or tertiary amine, or substitution of the carboxy group with an ester). [0300] A peptide provided herein can comprise one or more non-natural amino acids. A non-natural amino acid can be an amino acid which is not one of the twenty amino acids commonly found in peptides synthesized in nature, and known by the one letter abbreviations A, R, N, C, D , Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V. Non-natural amino acids or amino acid analogs include structures disclosed, e.g., in U.S. Patent Application Publication No. 20130123169, which is herein incorporated by reference. [0301] Amino acid analogs can include β-amino acid analogs. Amino acid analogs can include analogs of alanine, valine, glycine, leucine, arginine, lysine, aspartic acids, glutamic acids, cysteine, methionine, phenylalanine, tyrosine, proline, serine, threonine, and/or tryptophan. Examples of β-amino acid analogs and analogs of alanine, valine, glycine, leucine, arginine, lysine, aspartic acids, glutamic acids, cysteine, methionine, phenylalanine, tyrosine, proline, serine, threonine, and tryptophan can include structures disclosed, e.g., in U.S. Patent Application Publication No. 20130123169, which is herein incorporated by reference. [0302] Amino acid analogs can be racemic. In some embodiments, the D isomer of the amino acid analog is used. In some cases, the L isomer of the amino acid analog is used. In some embodiments, the amino acid analog comprises chiral centers that are in the R or S configuration. 
Sometimes, the amino group(s) of a β-amino acid analog is substituted with a protecting group, e.g., tert-butyloxycarbonyl (BOC group), 9-fluorenylmethyloxycarbonyl (FMOC), tosyl, and the like. Sometimes, the carboxylic acid functional group of a β-amino acid analog is protected, e.g., as its ester derivative. In some cases, the salt of the amino acid analog is used. [0303] A peptide provided herein can comprise a non-essential amino acid. A non-essential amino acid residue can be a residue that can be altered from the wild-type sequence of a peptide without abolishing or substantially altering its essential biological or biochemical activity (e.g., receptor binding or activation). A peptide provided herein can comprise an essential amino acid. An essential amino acid residue can be a residue that, when altered from the wild-type sequence of the peptide, results in abolishing or substantially abolishing the peptide's essential biological or biochemical activity. [0304] A peptide provided herein can comprise a conservative amino acid substitution. A conservative amino acid substitution can be one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families can include amino acids with basic side chains (e.g., K, R, H), acidic side chains (e.g., D, E), uncharged polar side chains (e.g., G, N, Q, S, T, Y, C), nonpolar side chains (e.g., A, V, L, I, P, F, M, W), beta-branched side chains (e.g., T, V, I) and aromatic side chains (e.g., Y, F, W, H). Thus, a predicted nonessential amino acid residue in a peptide, for example, can be replaced with another amino acid residue from the same side chain family. Other examples of acceptable substitutions can be substitutions based on isosteric considerations (e.g., norleucine for methionine) or other properties (e.g., 2-thienylalanine for phenylalanine, or 6-Cl-tryptophan for tryptophan). 
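As a non-limiting illustration of paragraph [0304], the side-chain families recited there can be used to test whether a proposed substitution stays within a family. This is a minimal sketch under the assumption that sharing at least one recited family is sufficient; the names `FAMILIES`, `shared_families`, and `is_conservative` are hypothetical, and isosteric or other property-based substitutions are not modeled.

```python
# Side-chain families exactly as recited in paragraph [0304].
# A residue can belong to more than one family (e.g., T, V, I, Y, F, W, H).
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a: str, b: str) -> list:
    """List the families that contain both residues."""
    a, b = a.upper(), b.upper()
    return [name for name, members in FAMILIES.items()
            if a in members and b in members]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one family."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return bool(shared_families(original, replacement))
```

Under this reading, replacing K with R is conservative (both basic), whereas replacing K with D is not.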
[0305] In some embodiments, a peptide provided herein is connected to a CD4+ (helper) T cell epitope. A "connection" can be, e.g., a direct or indirect covalent linkage, or a direct or indirect non-covalent linkage with a peptide. In some embodiments, the CD4+ (helper) T cell epitope is ISQAVHAAHAEINEAGR (SEQ ID NO: 119) (see related sequence in, e.g., Amabel et al. (2011) Precursor Frequency and Competition Dictate the HLA-A2-Restricted CD8+ T Cell Responses to Influenza A Infection and Vaccination in HLA-A2.1 Transgenic Mice, The Journal of Immunology 187(4): 1895-902, which is herein incorporated by reference in its entirety). In some cases, the CD4+ (helper) T cell epitope is AKFVAAWTLKAAA (HLA DR-binding Epitope, PADRE) (SEQ ID NO: 148), or a non-natural amino acid derivative of the PADRE sequence, AKXVAAWTLKAAAZC (SEQ ID NO: 149), wherein X is L-cyclohexylalanine and Z is aminocaproic acid.

[0306] In some embodiments, the C-terminus of a peptide provided herein, e.g., a peptide comprising a sequence selected from SEQ ID NOs: 1-68, is attached to a lysine and the lysine is attached to the N-terminus of a CD4+ T cell epitope. In some embodiments, the C-terminus of a CD4+ (helper) T cell epitope is attached to a lysine and the lysine is attached to the N-terminus of a peptide comprising a sequence selected from SEQ ID NOs: 1-68.
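The two lysine-bridged arrangements of paragraph [0306] can be illustrated at the sequence level. The sketch below assumes a simple linear arrangement written N-terminus to C-terminus with a single lysine (K) between the two parts; it does not model attachment through the lysine side chain or any non-covalent connection. The function name `link_via_lysine` and the placeholder `EXAMPLE_PEPTIDE` are hypothetical, and the placeholder is not one of SEQ ID NOs: 1-68.

```python
# Helper epitopes as recited in paragraph [0305].
HELPER_EPITOPE = "ISQAVHAAHAEINEAGR"  # SEQ ID NO: 119
PADRE = "AKFVAAWTLKAAA"               # SEQ ID NO: 148

# Hypothetical placeholder only; NOT a sequence from this disclosure.
EXAMPLE_PEPTIDE = "PEPTIDE"


def link_via_lysine(n_terminal_part: str, c_terminal_part: str) -> str:
    """Join two sequences N-to-C with one bridging lysine.

    The C-terminus of the first part is attached to K, and K is
    attached to the N-terminus of the second part.
    """
    return n_terminal_part + "K" + c_terminal_part


# Peptide first, then helper epitope.
peptide_then_epitope = link_via_lysine(EXAMPLE_PEPTIDE, PADRE)
# Helper epitope first, then peptide.
epitope_then_peptide = link_via_lysine(PADRE, EXAMPLE_PEPTIDE)
```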

[0307] 4b. Nucleic Acid-Based Vaccine

[0308] A nucleic acid-based vaccine can be formulated using techniques, carriers, and excipients as suitable. The nucleic acid can be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid can contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids can be obtained by chemical synthesis methods or by recombinant methods. The vaccine can be a DNA-based vaccine, an RNA-based vaccine, a hybrid DNA/RNA based vaccine, or a hybrid nucleic acid/peptide based vaccine. The peptide can be a peptide that has a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a peptide selected from the group consisting of SEQ ID NOs: 1-68. The peptide can be a peptide that has a sequence with at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a peptide selected from the group consisting of SEQ ID NOs: 1-68.
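The percentage thresholds of paragraph [0308] presuppose some way of scoring two sequences against each other, which this description does not specify. One simple possibility, shown here only as an assumption, is percent identity over an ungapped comparison of two equal-length peptides; alignment-based scoring with gaps or substitution matrices would give different values. The names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues.

    Assumes two non-empty, equal-length sequences compared position
    by position (no gaps); this is one simple reading of "homology".
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(candidate: str, reference: str, minimum: float) -> bool:
    """True if the candidate scores at least `minimum` percent."""
    return percent_identity(candidate, reference) >= minimum
```

With this scoring, a ten-residue peptide differing from its reference at one position scores 90% and would satisfy an "at least 90%" threshold but not an "at least 95%" threshold.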

[0309] Nucleic acid molecules can refer to at least two nucleotides covalently linked together. A nucleic acid described herein can contain phosphodiester bonds, although in some cases, as outlined below (for example in the construction of primers and probes such as label probes), nucleic acid analogs are included that can have alternate backbones, comprising, for example, phosphoramide (Beaucage et al., Tetrahedron 49(10): 1925 (1993) and references therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al, Eur. J. Biochem. 81:579 (1977); Letsinger et al, Nucl. Acids Res. 14:3487 (1986); Sawai et al, Chem. Lett. 805 (1984), Letsinger et al, J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al, Chemica Scripta 26:141 (1986)), phosphorothioate (Mag et al, Nucleic Acids Res. 19:1437 (1991); and U.S. Pat. No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321 (1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid (also referred to herein as "PNA") backbones and linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al, Chem. Int. Ed. Engl. 31:1008 (1992); Nielsen, Nature, 365:566 (1993); Carlsson et al, Nature 380:207 (1996), all of which are incorporated by reference). Other analog nucleic acids include those with bicyclic structures including locked nucleic acids (also referred to herein as "LNA"), Koshkin et al, J. Am. Chem. Soc. 120:13252-3 (1998); positive backbones (Denpcy et al, Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Pat. Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowshi et al, Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al, J. Am. Chem. Soc. 110:4470 (1988); Letsinger et al, Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and 3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Y. S. Sanghui and P.
Dan Cook; Mesmaeker et al., Bioorganic & Medicinal Chem. Lett. 4:395 (1994); Jeffs et al, J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Y. S. Sanghui and P. Dan Cook. Nucleic acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids (see Jenkins et al, Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs are described in Rawls, C & E News Jun. 2, 1997 page 35. "Locked nucleic acids" are also included within the definition of nucleic acid analogs. LNAs are a class of nucleic acid analogues in which the ribose ring is "locked" by a methylene bridge connecting the 2'-O atom with the 4'-C atom. All of these references are hereby expressly incorporated by reference. These modifications of the ribose-phosphate backbone can be done to increase the stability and half-life of such molecules in physiological environments. For example, PNA:DNA and LNA-DNA hybrids can exhibit higher stability and thus can be used in some embodiments. The target nucleic acids can be single stranded or double stranded, as specified, or contain portions of both double stranded and single stranded sequence. Depending on the application, the nucleic acids can be DNA (including, e.g., genomic DNA, mitochondrial DNA, and cDNA), RNA (including, e.g., mRNA and rRNA) or a hybrid, where the nucleic acid contains any combination of deoxyribo- and ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc.

[0310] The vector can be a circular plasmid or a linear nucleic acid. The circular plasmid or linear nucleic acid can be capable of directing expression of a particular nucleotide sequence in an appropriate subject cell.
The vector can have a promoter operably linked to the peptide - encoding nucleotide sequence, which can be operably linked to termination signals. The vector can also contain sequences required for proper translation of the nucleotide sequence. The vector comprising the nucleotide sequence of interest can be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression of the nucleotide sequence in the expression cassette can be under the control of a constitutive promoter or of an inducible promoter, which can initiate transcription only when the host cell is exposed to some particular external stimulus. [0311] The vector can be a plasmid. The plasmid can be useful for transfecting cells with nucleic acid encoding the peptide, which the transformed host cells can be cultured and maintained under conditions wherein expression of the peptide takes place. [0312] The plasmid can comprise a nucleic acid sequence that encodes one or more of the various peptides disclosed herein. A single plasmid can contain coding sequence for a single peptide, or coding sequence for more than one peptide. Sometimes, the plasmid can further comprise coding sequence that encodes an adjuvant, such as an immune stimulating molecule, such as a cytokine. [0313] The plasmid can further comprise an initiation codon, which can be upstream of the coding sequence, and a stop codon, which can be downstream of the coding sequence. The initiation and termination codon can be in frame with the coding sequence. The plasmid can also comprise a promoter that is operably linked to the coding sequence, and an enhancer upstream of the coding sequence. The enhancer can be human actin, human myosin, human hemoglobin, human muscle creatine, or a viral enhancer such as one from CMV, FMDV, RSV, or EBV. Polynucleotide function enhancers are described in U.S. Patent Nos. 5,593,972, 5,962,428, and WO 94/016737. 
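The arrangement described in paragraph [0313], an initiation codon upstream of and a stop codon downstream of the coding sequence, both in frame, can be checked mechanically for a candidate DNA coding sequence. The following is a minimal sketch assuming the standard genetic code, ATG as the initiation codon, and a sequence that begins at the initiation codon and ends at the stop codon; the function name `check_coding_sequence` and the example sequences are hypothetical.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_coding_sequence(dna: str) -> list:
    """Return a list of problems found; an empty list means none.

    Assumes `dna` runs from the initiation codon through the stop
    codon, so that "in frame" reduces to a length divisible by three.
    """
    dna = dna.upper()
    problems = []
    if len(dna) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("does not start with ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(c in STOP_CODONS for c in codons[:-1]):
        problems.append("premature in-frame stop codon")
    return problems
```

A sequence such as `ATGGCTAAATAA` (ATG, GCT, AAA, TAA) passes all four checks, whereas dropping its final base breaks the reading frame and removes the terminal stop codon.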
[0314] The plasmid can also comprise a mammalian origin of replication in order to maintain the plasmid extrachromosomally and produce multiple copies of the plasmid in a cell. The plasmid can be pVAXI, pCEP4, or pREP4 from Invitrogen (San Diego, CA). [0315] The plasmid can also comprise a regulatory sequence, which may be well suited for gene expression in a cell into which the plasmid is administered. The coding sequence can comprise a codon that can allow more efficient transcription of the coding sequence in the host cell. [0316] The plasmid can be pSE420 (Invitrogen, San Diego, CA), which can be used for protein production in Escherichia coli (E.coli). The plasmid can also be pYES2 (Invitrogen, San Diego, CA), which can be used for protein production in Saccharomyces cerevisiae strains of yeast. The plasmid can also be of the MAXBAC™ complete baculovirus expression system (Invitrogen, San Diego, CA), which can be used for protein production in insect cells. The plasmid can also be pcDNA I or pcDNA3 (Invitrogen, San Diego, CA), which can be used for protein production in mammalian cells such as Chinese hamster ovary (CHO) cells. [0317] The vector can be circular plasmid, which can transform a target cell by integration into the cellular genome or exist extrachromosomally (e.g., autonomous replicating plasmid with an origin of replication). Exemplary vectors include pVAX, pcDNA3.0, or provax, or any other expression vector capable of expressing DNA encoding the antigen and enabling a cell to translate the sequence to an antigen that is recognized by the immune system. [0318] The nucleic acid based vaccine can also be a linear nucleic acid vaccine, or linear expression cassette ("LEC"), that is capable of being efficiently delivered to a subject via electroporation and expressing one or more peptides disclosed herein. The LEC can be any linear DNA devoid of any phosphate backbone. The DNA can encode one or more peptides disclosed herein. 
The LEC can contain a promoter, an intron, a stop codon, and/or a polyadenylation signal. The expression of the peptide can be controlled by the promoter. The LEC can not contain any antibiotic resistance genes and/or a phosphate backbone. The LEC cannot contain other nucleic acid sequences unrelated to the peptide expression. [0319] The LEC can be derived from any plasmid capable of being linearized. The plasmid can express the peptide. Exemplary plasmids include: pNP (Puerto Rico/34), pM2 (New Caledonia/99), WLV009, pVAX, pcDNA3.0, provax, or any other expression vector capable of expressing DNA encoding the antigen and enabling a cell to translate the sequence to an antigen that is recognized by the immune system. [0320] The nucleic acid based vaccine can be delivered to a subject through a parenteral delivery method. A parenteral delivery can include intravenous, transdermal, oral, intrabiliary, intraparenchymal, intra-hepatic artery, intra-portal vein, intratumoral, or transvenous delivery. Sometimes, a parenteral delivery can utilize a needle (e.g., a hypodermic needle) for delivery of the nucleic acid based vaccine. The nucleic acid based vaccine can be formulated in an aqueous solution, e.g., saline. The delivery can be further assisted by electroporation. Sometimes, a parenteral delivery can utilize a gene gun as a delivery method. The nucleic acid based vaccine can be formulated as a DNA-coated microparticle, e.g., a DNA-coated gold or tungsten bead. The gene gun delivery method can use a ballistical delivery method to accelerate nucleic acid into target cells. Sometimes, a parenteral delivery can utilize a pneumatic injection as a delivery method. The nucleic acid based vaccine can be formulated as an aqueous solution. [0321] The nucleic acid based vaccine can also be delivered to a subject through a topical delivery method. 
Topical nucleic acid based vaccine can be formulated as aerosol instillation of naked DNA to be delivered onto mucosal surfaces, such as the nasal and lung mucosa, ocular administration, or vaginal mucosa.

[0322] The nucleic acid based vaccine can further be delivered to a subject through a lipid-mediated delivery method. Sometimes, the lipid-mediated delivery method can be a cytofectin-mediated delivery method. Cytofectin can be cationic lipids that can bind and transport nucleic acid molecules across cell membranes. The nucleic acid can be incorporated by cytofectin-based liposomes. Sometimes, the lipid-mediated delivery method can be a neutral lipid-mediated delivery method. Sometimes, the method of nucleic acids delivery into a subject utilizing lipids, such as neutral or cationic lipids, can be a method described in U.S. Patent No. 5,279,833 or 5,334,761.

[0323] A composition, e.g., a vaccine, can comprise at least 5, 10, 25, 50, or 100 different nucleic acids.

[0324] 4c. Antibody Based Vaccine

[0325] A vaccine can comprise an entity that binds a peptide sequence described herein. The entity can be an antibody. Antibody-based vaccine can be formulated using any techniques, carriers, and excipients as suitable. The antibody can be a natural antibody, a chimeric antibody, a humanized antibody, or can be an antibody fragment. The antibody can recognize one or more of the sequences described herein. The antibody can recognize one or more sequences selected from SEQ ID NOs: 1-68. The antibody can recognize a sequence that has at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-68. The antibody can recognize a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-68. The antibody can recognize a sequence length of at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%, or more, of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-68. The antibody can recognize a sequence length of at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-68. The antibody can recognize one or more of the sequences selected from SEQ ID NOs: 1-53. The antibody can recognize a sequence that has at most 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-53. The antibody can recognize a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-53. The antibody can recognize a sequence length of at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95%, or more, of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-53. The antibody can recognize a sequence length of at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the sequence length of a sequence selected from the group consisting of SEQ ID NOs: 1-53.

[0326] In some embodiments, the antibody recognizes epitopes from multiple strains of a virus, a bacterium, or a fungus described herein. In some embodiments, the antibody recognizes epitopes from multiple strains of a virus, such as a DNA virus or an RNA virus. The antibody can further recognize epitopes from multiple strains of influenza virus, such as influenza A virus, influenza B virus, or influenza C virus; multiple strains of hepatitis virus, such as hepatitis B virus or hepatitis C virus; or multiple strains of HIV, such as HIV-1 and HIV-2.
[0327] An antibody can include fully assembled antibodies, antibody fragments that can bind antigen (e.g., Fab, F(ab')2, Fv, single chain antibodies, diabodies, antibody chimeras, hybrid antibodies, bispecific antibodies, humanized antibodies, and the like), and recombinant peptides comprising the foregoing.

[0328] In some embodiments, an antibody is a monoclonal antibody. The preparation of monoclonal antibodies is known in the art and can be accomplished by fusing spleen cells from a host sensitized to the antigen with myeloma cells in accordance with known techniques or by transforming the spleen cells with an appropriate transforming vector to immortalize the cells. The cells can be cultured in a selective medium, cloned, and screened to select monoclonal antibodies that bind the designated antigens. Numerous references can be found on the preparation of monoclonal and polyclonal antibodies (e.g., Kohler and Milstein, (1975) Nature (London) 256, 495-497; Kennet, R., (1980) in Monoclonal Antibodies (Kennet et al, Eds.) pp. 365-367, Plenum Press, N.Y.).

[0329] A native antibody (native immunoglobulin) can be a heterotetrameric glycoprotein of about 150,000 daltons, composed of two identical light (L) chains and two identical heavy (H) chains. Each light chain can be linked to a heavy chain by one covalent disulfide bond, while the number of disulfide linkages can vary among the heavy chains of different immunoglobulin isotypes. Each heavy and light chain can also have regularly spaced intrachain disulfide bridges. Each heavy chain has at one end a variable domain (VH) followed by a number of constant domains. Each light chain can have a variable domain at one end (VL) and a constant domain at its other end; the constant domain of the light chain can be aligned with the first constant domain of the heavy chain, and the light chain variable domain can be aligned with the variable domain of the heavy chain.
Particular amino acid residues can form an interface between the light and heavy-chain variable domains.

[0330] Variable regions can confer antigen-binding specificity. In some cases, the variability is not evenly distributed throughout the variable domains of antibodies. Variability can be concentrated in three segments called complementarity determining regions (CDRs) or hypervariable regions, both in the light chain and the heavy-chain variable domains. The more highly conserved portions of variable domains can be located in the framework (FR) regions. The variable domains of native heavy and light chains each can comprise four FR regions, largely adopting a β-pleated-sheet configuration, connected by three CDRs, which form loops connecting, and in some cases forming part of, the β-pleated-sheet structure. The CDRs in each chain can be held together in close proximity by the FR regions and, with the CDRs from the other chain, contribute to the formation of the antigen-binding site of antibodies (see, Kabat et al. (1991) NIH Publ. No. 91-3242, Vol. I, pages 647-669). In some cases, the constant domains are not involved directly in binding an antibody to an antigen, but can exhibit various effector functions, such as Fc receptor (FcR) binding, participation of the antibody in antibody-dependent cellular toxicity, initiation of complement dependent cytotoxicity, and mast cell degranulation.

[0331] A hypervariable region can refer to the amino acid residues of an antibody that are responsible for antigen-binding. The hypervariable region can comprise amino acid residues from a complementarity determining region or CDR (i.e., residues 24-34 (L1), 50-56 (L2), and 89-97 (L3) in the light-chain variable domain and 31-35 (H1), 50-65 (H2), and 95-102 (H3) in the heavy-chain variable domain; Kabat et al. (1991) Sequences of Proteins of Immunological Interest, 5th Ed. Public Health Service, National Institute of Health, Bethesda, Md.)
and/or those residues from a "hypervariable loop" (i.e., residues 26-32 (L1), 50-52 (L2), and 91-96 (L3) in the light-chain variable domain and (H1), 53-55 (H2), and 96-101 (H3) in the heavy chain variable domain; Chothia and Lesk, (1987) J. Mol. Biol, 196:901-917). Framework or FR residues can be those variable domain residues other than the hypervariable region residues, as herein defined.

[0332] Antibody fragments can comprise a portion of an intact antibody, e.g., the antigen-binding or variable region of the intact antibody. Examples of antibody fragments include Fab, Fab', F(ab')2, and Fv fragments; diabodies; minibodies; linear antibodies (Zapata et al. (1995) Protein Eng. 10:1057-1062); single-chain antibody molecules; and multispecific antibodies formed from antibody fragments. Papain digestion of antibodies can produce two identical antigen-binding fragments, called Fab fragments, each with a single antigen-binding site, and a residual Fc fragment, whose name reflects its ability to crystallize readily. Pepsin treatment yields an F(ab')2 fragment that has two antigen-combining sites and is still capable of cross-linking antigen.
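For illustration, the Kabat CDR residue ranges recited in paragraph [0331] can be held in a lookup table to label a variable-domain position as belonging to a CDR or to a framework (FR) region. This is a minimal sketch that treats positions as plain integers; Kabat insertion letters (e.g., a position such as 35A) and the hypervariable-loop definition are not handled. The names `KABAT_CDRS` and `kabat_region` are hypothetical.

```python
# Inclusive Kabat CDR residue ranges as recited in paragraph [0331].
KABAT_CDRS = {
    "light": {"L1": (24, 34), "L2": (50, 56), "L3": (89, 97)},
    "heavy": {"H1": (31, 35), "H2": (50, 65), "H3": (95, 102)},
}


def kabat_region(chain: str, position: int) -> str:
    """Return the CDR name for a position, or "FR" if outside all CDRs.

    `chain` is "light" or "heavy"; `position` is an integer Kabat
    position without insertion letters (a simplifying assumption).
    """
    for name, (start, end) in KABAT_CDRS[chain].items():
        if start <= position <= end:
            return name
    return "FR"
```

For instance, light-chain position 30 falls within L1, while heavy-chain position 40 lies in a framework region.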

[0333] Fv can be the minimum antibody fragment that contains a complete antigen recognition and binding site. This region can consist of a dimer of one heavy- and one light-chain variable domain in tight, non-covalent association. It is in this configuration that the three CDRs of each variable domain can interact to define an antigen-binding site on the surface of the VH-VL dimer. Collectively, the six CDRs can confer antigen-binding specificity to the antibody. However, even a single variable domain (or half of an Fv comprising only three CDRs specific for an antigen) can have the ability to recognize and bind antigen, although at a lower affinity than the entire binding site.

[0334] The Fab fragment can contain the constant domain of the light chain and the first constant domain (CH1) of the heavy chain. Fab fragments can differ from Fab' fragments by the addition of a few residues at the carboxy terminus of the heavy chain CH1 domain including one or more cysteines from the antibody hinge region. Fab'-SH can be used herein for Fab' in which the cysteine residue(s) of the constant domains bear a free thiol group. Fab' fragments can be produced by reducing the F(ab')2 fragment's heavy chain disulfide bridge. Other chemical couplings of antibody fragments are also known.

[0335] The light chains of antibodies (immunoglobulins) from any vertebrate species can be assigned to one of two clearly distinct types, called kappa (κ) and lambda (λ), based on the amino acid sequences of their constant domains.

[0336] Depending on the amino acid sequence of the constant domain of their heavy chains, immunoglobulins can be assigned to different classes. Five major classes of human immunoglobulins include: IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2. The heavy-chain constant domains that correspond to the different classes of immunoglobulins are called alpha, delta, epsilon, gamma, and mu, respectively. The subunit structures and three-dimensional configurations of different classes of immunoglobulins are well known. Different isotypes can have different effector functions. For example, human IgG1 and IgG3 isotypes have ADCC (antibody dependent cell-mediated cytotoxicity) activity.

[0337] Monoclonal antibodies can be obtained from any suitable species, e.g., murine, rabbit, sheep, goat, or human monoclonal antibodies.

[0338] A composition, e.g., vaccine, can comprise about or at least 5, 10, 25, 50 or 100 different antibodies.

[0339] 4d. Antigen Presenting Cell (APC) Based Vaccine

[0340] An antigen presenting cell (APC) based vaccine can be formulated using any of the known techniques, carriers, and excipients as suitable and as understood in the art. APCs include monocytes, monocyte-derived cells, macrophages, and dendritic cells. Sometimes, an APC based vaccine can be a dendritic cell-based vaccine.

[0341] A dendritic cell (DC)-based vaccine can be prepared by any methods known in the art. In some cases, dendritic cell-based vaccines can be prepared through an ex vivo or in vivo method. The ex vivo method can comprise the use of autologous DCs pulsed ex vivo with the peptides described herein, to activate or load the DCs prior to administration into the patient. The in vivo method can comprise targeting specific DC receptors using antibodies coupled with the peptides described herein. The DC-based vaccine can further comprise DC activators such as TLR3, TLR-7-8, and CD40 agonists. The DC-based vaccine can further comprise adjuvants, and a pharmaceutically acceptable carrier.

[0342] 4e. Virus-based Vaccine

[0343] A virus-based vaccine can be generated based on live virus or on inactivated virus. Viruses can be engineered to express one or more proteins that comprise any of the sequences described herein. Vaccines based on live virus can use an attenuated virus, or a virus that can be cold-adapted. Vaccines based on inactivated virus can comprise whole virion, split virion, or purified surface antigens (e.g., HA and/or N from influenza A virus). Chemical means for inactivating a virus can include treatment with an effective amount of one or more of the following agents: detergents, formaldehyde, β-propiolactone, methylene blue, psoralen, carboxyfullerene (C60), binary ethylamine, acetyl ethyleneimine, or combinations thereof. Non-chemical methods of viral inactivation are known in the art, such as UV light, heat inactivation, or gamma irradiation.

[0344] Virions can be harvested from virus-containing fluids by various methods. For example, a purification process can involve zonal centrifugation using a linear sucrose gradient solution that includes detergent to disrupt the virions. Antigens can be purified, after optional dilution, by diafiltration.

[0345] Split virions can be obtained by treating purified virions with detergents (e.g., ethyl ether, polysorbate 80, deoxycholate, tri-N-butyl phosphate, Triton X-100, Triton N101, cetyltrimethylammonium bromide, Tergitol NP9, etc.) to produce subvirion preparations, including the "Tween-ether" splitting process. Methods of splitting influenza viruses are well known in the art. Splitting of the virus can be carried out by disrupting or fragmenting whole virus, whether infectious or non-infectious, with a disrupting concentration of a splitting agent. The disruption can result in a full or partial solubilization of the virus proteins, altering the integrity of the virus.
Splitting agents can be non-ionic or ionic (e.g., cationic) surfactants, e.g., alkylglycosides, alkylthioglycosides, acyl sugars, sulphobetaines, betaines, polyoxyethylenealkylethers, N,N-dialkyl-glucamides, Hecameg, alkylphenoxy-polyethoxyethanols, quaternary ammonium compounds, sarcosyl, CTABs (cetyl trimethyl ammonium bromides), tri-N-butyl phosphate, Cetavlon, myristyltrimethylammonium salts, lipofectin, lipofectamine, and DOT-MA, the octyl- or nonylphenoxy polyoxyethanols (e.g., the Triton surfactants, such as Triton X-100 or Triton N101), polyoxyethylene sorbitan esters (the Tween surfactants), polyoxyethylene ethers, polyoxyethylene esters, etc. One exemplary splitting procedure can use the consecutive effects of sodium deoxycholate and formaldehyde, and splitting can take place during initial virion purification (e.g., in a sucrose density gradient solution). Thus a splitting process can involve clarification of the virion-containing material (to remove non-virion material), concentration of the harvested virions (e.g., using an adsorption method, such as CaHPO4 adsorption), separation of whole virions from non-virion material, splitting of virions using a splitting agent in a density gradient centrifugation step (e.g., using a sucrose gradient that contains a splitting agent such as sodium deoxycholate), and then filtration (e.g., ultrafiltration) to remove undesired materials. Split virions can usefully be resuspended in sodium phosphate-buffered isotonic sodium chloride solution. The BEGRIVAC™, FLUARIX™, FLUZONE™, and FLUSHIELD™ products are split vaccines.

[0346] Purified surface antigen vaccines can comprise the influenza surface antigens haemagglutinin and, typically, also neuraminidase. Processes for preparing these proteins in purified form are well known in the art. The FLUVIRIN™, AGRIPPAL™, and INFLUVAC™ products are examples.

[0347] Vaccine based on inactivated virus can include the virosome (nucleic acid free viral-like liposomal particles).
Virosomes can be prepared by solubilization of influenza virus with a detergent followed by removal of the nucleocapsid and reconstitution of the membrane containing the viral glycoproteins. Virosomes can also be prepared by adding viral membrane glycoproteins to excess amounts of phospholipids, to yield liposomes with viral proteins in their membrane.

[0348] 5. ROUTES OF ADMINISTRATION, FORMULATIONS, CARRIERS AND EXCIPIENTS, DOSAGES

[0349] 5a. Administration routes

[0350] A composition described herein, e.g., a vaccine, can be delivered via a variety of routes to a subject, e.g., a human. Delivery routes can include oral (including buccal and sublingual), rectal, nasal, topical, transdermal, transmucosal, pulmonary, vaginal, suppository, or parenteral (including intramuscular, intra-arterial, intrathecal, intradermal, intraperitoneal, subcutaneous and intravenous) administration or in a form suitable for administration by aerosolization, inhalation or insufflation. General information on drug delivery systems can be found in Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems (Lippincott Williams & Wilkins, Baltimore Md. (1999)). The composition, e.g., vaccine, can be administered to muscle, or can be administered via intradermal or subcutaneous injections, or transdermally, such as by iontophoresis. The composition, e.g., vaccine, can be delivered to a subject by epidermal administration. Methods of formulation are known in the art, for example, as disclosed in Remington's Pharmaceutical Sciences, latest edition, Mack Publishing Co., Easton P.

[0351] 5b. Formulations

[0352] In some embodiments, a composition provided herein, e.g., a vaccine, is formulated based, in part, on the intended route of administration of the composition. The composition, e.g., vaccine, can comprise one or more active agents, such as one or more peptides, nucleic acids, proteins (e.g., antibodies or fragments thereof), APCs, or viruses described herein. A composition comprising one or more active agents in combination with one or more adjuvants can be formulated in conventional manner using one or more physiologically acceptable carriers, comprising excipients, diluents, and/or auxiliaries, e.g., which facilitate processing of the one or more active agents into preparations that can be administered. The one or more active agents described herein can be delivered to a subject using a number of routes or modes of administration described herein, e.g., oral, buccal, topical, rectal, transdermal, transmucosal, subcutaneous, intravenous, and intramuscular applications, as well as by inhalation.

[0353] In some embodiments, a composition described herein, e.g., a vaccine, is a liquid preparation such as a suspension, syrup, or elixir. The composition, e.g., vaccine, can also be a preparation for parenteral, subcutaneous, intradermal, intramuscular, or intravenous administration (e.g., injectable administration), such as a sterile suspension or emulsion. In some cases, aqueous solutions can be packaged for use as is, or lyophilized, with the lyophilized preparation being combined with a sterile solution prior to administration. In some embodiments, the composition, e.g., vaccine, can be delivered as a solution or as a suspension. In general, formulations such as jellies, creams, lotions, suppositories and ointments can provide an area with more extended exposure to one or more active agents, while formulations in solution, e.g., sprays, can provide more immediate, short-term exposure.

[0354] 5b1. Formulations for inhalation (e.g., nasal administration or oral inhalation)

[0355] In some embodiments, a composition described herein, e.g., a vaccine, is formulated for administration via the nasal passages of a subject. Formulations suitable for nasal administration, wherein the carrier is a solid, can include a coarse powder having a particle size, for example, in the range of about 10 to about 500 microns, which is administered in the manner in which snuff is taken, e.g., by rapid inhalation through the nasal passage from a container of the powder held close up to the nose. The formulation can be administered as a nasal spray, as nasal drops, or as an aerosol by nebulizer. The formulation can include aqueous or oily solutions of the vaccine.

[0356] A composition provided herein, e.g., a vaccine, can be formulated as an aerosol formulation. The aerosol formulation can be, e.g., an aerosol solution, suspension or dry powder. The aerosol can be administered through the respiratory system or nasal passages. For example, the composition can be suspended or dissolved in an appropriate carrier, e.g., a pharmaceutically acceptable propellant, and administered directly into the lungs using a nasal spray or inhalant. For example, an aerosol formulation comprising one or more active agents can be dissolved, suspended or emulsified in a propellant or a mixture of solvent and propellant, e.g., for administration as a nasal spray or inhalant. The aerosol formulation can contain any acceptable propellant under pressure, such as a cosmetically or dermatologically or pharmaceutically acceptable propellant.

[0357] In some embodiments, an aerosol formulation for nasal administration is an aqueous solution designed to be administered to the nasal passages in drops or sprays. Nasal solutions can be similar to nasal secretions in that they can be isotonic and slightly buffered to maintain a pH of about 5.5 to about 6.5. In some cases, pH values outside of this range can be used.
Antimicrobial agents or preservatives can also be included in the formulation.

[0358] An aerosol formulation for inhalation can be designed so that one or more active agents are carried into the respiratory system of the subject when administered by the nasal or oral respiratory route. Inhalation solutions can be administered, for example, by a nebulizer. Inhalations or insufflations, comprising finely powdered or liquid drugs, can be delivered to the respiratory system as a pharmaceutical aerosol of a solution or suspension of the agent or combination of agents in a propellant, e.g., to aid in dispersion. Propellants can be liquefied gases, including halocarbons, for example, fluorocarbons such as fluorinated chlorinated hydrocarbons, hydrochlorofluorocarbons, and hydrochlorocarbons, as well as hydrocarbons and hydrocarbon ethers.

[0359] Halocarbon propellants can include fluorocarbon propellants in which all hydrogens are replaced with fluorine, chlorofluorocarbon propellants in which all hydrogens are replaced with chlorine and at least one fluorine, hydrogen-containing fluorocarbon propellants, and hydrogen-containing chlorofluorocarbon propellants. Halocarbon propellants are described, e.g., in Johnson, U.S. Pat. No. 5,376,359, issued Dec. 27, 1994; Byron et al., U.S. Pat. No. 5,190,029, issued Mar. 2, 1993; and Purewal et al., U.S. Pat. No. 5,776,434, issued Jul. 7, 1998. Hydrocarbon propellants can include, for example, propane, isobutane, n-butane, pentane, isopentane, and neopentane. A blend of hydrocarbons can also be used as a propellant. Ether propellants include, for example, dimethyl ether as well as other ethers. An aerosol formulation can also comprise more than one propellant. For example, the aerosol formulation can comprise more than one propellant from the same class, such as two or more fluorocarbons; or more than one, more than two, or more than three propellants from different classes, such as a fluorohydrocarbon and a hydrocarbon.
A composition described herein, e.g., vaccine, can also be dispensed with a compressed gas, e.g., an inert gas such as carbon dioxide, nitrous oxide, or nitrogen.

[0360] The aerosol formulation can also include other components, for example, ethanol, isopropanol, propylene glycol, as well as surfactants or other components, such as oils and detergents. These components can serve to stabilize the formulation and/or lubricate valve components.

[0361] The aerosol formulation can be packaged under pressure and can be formulated as an aerosol using solutions, suspensions, emulsions, powders, and semisolid preparations. For example, a solution aerosol formulation can comprise a solution of an active agent in (substantially) pure propellant or in a mixture of propellant and solvent. The solvent can be used to dissolve one or more active agents and/or retard the evaporation of the propellant. Solvents can include, for example, water, ethanol, and glycols. Any combination of suitable solvents can be used, optionally combined with preservatives, antioxidants, and/or other aerosol components.

[0362] An aerosol formulation can be a dispersion or suspension. A suspension aerosol formulation can comprise a suspension of one or more active agents, e.g., peptides, and a dispersing agent. Dispersing agents can include, for example, sorbitan trioleate, oleyl alcohol, oleic acid, lecithin, and corn oil. A suspension aerosol formulation can also include lubricants, preservatives, antioxidants, and/or other aerosol components.

[0363] An aerosol formulation can similarly be formulated as an emulsion. An emulsion aerosol formulation can include, for example, an alcohol such as ethanol, a surfactant, water, and a propellant, as well as an active agent or combination of active agents, e.g., one or more peptides. The surfactant used can be nonionic, anionic, or cationic. One example of an emulsion aerosol formulation comprises, for example, ethanol, surfactant, water, and propellant.
Another example of an emulsion aerosol formulation comprises, for example, vegetable oil, glyceryl monostearate, and propane.

[0364] 5b2. Formulations for parenteral administration

[0365] In some embodiments, a composition, e.g., vaccine, comprising one or more active agents is formulated for parenteral administration (e.g., by injection, for example bolus injection or continuous infusion; intravenous administration or intramuscular injection) and can be presented in unit dose form in ampoules, pre-filled syringes, small volume infusion or in multi-dose containers with an added preservative. The composition can take such forms as suspensions, solutions, or emulsions in oily or aqueous vehicles, for example solutions in aqueous polyethylene glycol.

[0366] In some embodiments, for injectable formulations, a vehicle can be chosen from those known in the art to be suitable, including aqueous solutions or oil suspensions, or emulsions, with sesame oil, corn oil, cottonseed oil, or peanut oil, as well as elixirs, mannitol, dextrose, or a sterile aqueous solution, and similar pharmaceutical vehicles. The formulation can also comprise polymer compositions which are biocompatible and biodegradable, such as poly(lactic-co-glycolic)acid. These materials can be made into micro or nanospheres, loaded with drug and further coated or derivatized to provide superior sustained release performance. Vehicles suitable for periocular or intraocular injection include, for example, suspensions of active agent in injection grade water, liposomes, and vehicles suitable for lipophilic substances and those known in the art.

[0367] In some embodiments, the composition, e.g., vaccine, is formulated for intravenous administration to human beings. The composition, e.g., vaccine, for intravenous administration can be a solution in sterile isotonic aqueous buffer. In some cases, the composition, e.g., vaccine, includes a solubilizing agent and a local anesthetic such as lidocaine to ease pain at the site of the injection.
The ingredients can be supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampoule or sachet indicating the quantity of active agent. Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water or saline. Where the composition is administered by injection, an ampoule of sterile water for injection or saline can be provided so that the ingredients can be mixed prior to administration.

[0368] When administration is by injection, a composition, e.g., vaccine, comprising one or more active agents can be formulated in aqueous solutions, specifically in physiologically compatible buffers such as Hanks solution, Ringer's solution, or physiological saline buffer. The solution can contain formulatory agents such as suspending, stabilizing, and/or dispersing agents. Alternatively, the one or more active agents can be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use. In another embodiment, the composition, e.g., vaccine, does not comprise an adjuvant or any other substance added to enhance the immune response stimulated by the active agent. In another embodiment, the composition, e.g., vaccine, comprises a substance that inhibits an immune response to the one or more active agents.

[0369] In some embodiments, one or more active agents are formulated as a depot preparation. Such long acting formulations can be administered by implantation or transcutaneous delivery (e.g., subcutaneously or intramuscularly), intramuscular injection or use of a transdermal patch.
Thus, for example, one or more active agents can be formulated with suitable polymeric or hydrophobic materials (e.g., as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.

[0370] 5b3. Formulations for topical administration

[0371] In some embodiments, a composition provided herein, e.g., a vaccine, comprises one or more agents that exert local and regional effects when administered topically or injected at or near particular sites of infection. Direct topical application, e.g., of a viscous liquid, solution, suspension, dimethylsulfoxide (DMSO)-based solutions, liposomal formulations, gel, jelly, cream, lotion, ointment, suppository, foam, or aerosol spray, can be used for local administration, to produce for example local and/or regional effects. Pharmaceutically appropriate vehicles for such formulation include, for example, lower aliphatic alcohols, polyglycols (e.g., glycerol or polyethylene glycol), esters of fatty acids, oils, fats, silicones, and the like. Such preparations can also include preservatives (e.g., p-hydroxybenzoic acid esters) and/or antioxidants (e.g., ascorbic acid and tocopherol). See also Dermatological Formulations: Percutaneous absorption, Barry (Ed.), Marcel Dekker Inc., 1983. In some embodiments, local/topical formulations comprising one or more active agents are used to treat epidermal or mucosal viral infections.

[0372] A composition provided herein, e.g., a vaccine, can contain a cosmetically or dermatologically acceptable carrier. Such carriers are compatible with skin, nails, mucous membranes, tissues, and/or hair, and can include any cosmetic or dermatological carrier meeting these requirements. Such carriers can be readily selected by one of ordinary skill in the art. In formulating skin ointments, one or more agents can be formulated in an oleaginous hydrocarbon base, an anhydrous absorption base, a water-in-oil absorption base, an oil-in-water water-removable base and/or a water-soluble base.
Examples of such carriers and excipients include humectants (e.g., urea), glycols (e.g., propylene glycol), alcohols (e.g., ethanol), fatty acids (e.g., oleic acid), surfactants (e.g., isopropyl myristate and sodium lauryl sulfate), pyrrolidones, glycerol monolaurate, sulfoxides, terpenes (e.g., menthol), amines, amides, alkanes, alkanols, water, calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and polymers such as polyethylene glycols.

[0373] Ointments and creams can, for example, be formulated with an aqueous or oily base with the addition of suitable thickening and/or gelling agents. Lotions can be formulated with an aqueous or oily base and will in general also contain one or more emulsifying agents, stabilizing agents, dispersing agents, suspending agents, thickening agents, or coloring agents. The construction and use of transdermal patches for the delivery of pharmaceutical agents is known in the art. See, e.g., U.S. Pat. Nos. 5,023,252, 4,992,445, and 5,001,139. Such patches can be constructed for continuous, pulsatile, or on demand delivery of pharmaceutical agents.

[0374] Lubricants which can be used to form compositions and dosage forms can include calcium stearate, magnesium stearate, mineral oil, light mineral oil, glycerin, sorbitol, mannitol, polyethylene glycol, other glycols, stearic acid, sodium lauryl sulfate, talc, hydrogenated vegetable oil (e.g., peanut oil, cottonseed oil, sunflower oil, sesame oil, olive oil, corn oil, and soybean oil), zinc stearate, ethyl oleate, ethyl laurate, agar, or mixtures thereof. Additional lubricants include, for example, a syloid silica gel, a coagulated aerosol of synthetic silica, or mixtures thereof. A lubricant can optionally be added, in an amount of less than about 1 weight percent of the composition.

[0375] In some embodiments, a composition provided herein, e.g., a vaccine, can be in any form suitable for topical application, including aqueous, aqueous-alcoholic or oily solutions, lotion or serum dispersions, aqueous, anhydrous or oily gels, emulsions obtained by dispersion of a fatty phase in an aqueous phase (O/W or oil in water) or, conversely, (W/O or water in oil), microemulsions or alternatively microcapsules, microparticles or lipid vesicle dispersions of ionic and/or nonionic type. Other than the one or more active agents, the amounts of the various constituents of the compositions provided herein can be those used in the art. These compositions can constitute protection, treatment or care creams, milks, lotions, gels or foams for the face, for the hands, for the body and/or for the mucous membranes, or for cleansing the skin. The compositions can also consist of solid preparations constituting soaps or cleansing bars.

[0376] A composition provided herein, e.g., a vaccine, for local/topical application can include one or more antimicrobial preservatives such as quaternary ammonium compounds, organic mercurials, p-hydroxy benzoates, aromatic alcohols, chlorobutanol, and the like.

[0377] 5b4. Formulations for oral administration

[0378] In some embodiments, a composition provided herein, e.g., a vaccine, is formulated for oral administration. For oral administration, one or more active agents can be formulated readily by combining the one or more active agents with pharmaceutically acceptable carriers known in the art. Such carriers enable active agents to be formulated as tablets, including chewable tablets, pills, dragees, capsules, lozenges, hard candy, liquids, gels, syrups, slurries, powders, suspensions, elixirs, wafers, and the like, for oral ingestion by a patient to be treated. Such formulations can comprise pharmaceutically acceptable carriers including solid diluents or fillers, sterile aqueous media and various non-toxic organic solvents. A solid carrier can be one or more substances which can also act as diluents, flavoring agents, solubilizers, lubricants, suspending agents, binders, preservatives, tablet disintegrating agents, or an encapsulating material. In powders, the carrier can be a finely divided solid which is a mixture with the finely divided active component. In tablets, the active component generally is mixed with the carrier having the desired binding capacity in suitable proportions and compacted in the shape and size desired. The powders and tablets can contain from about one (1) to about seventy (70) percent of the one or more active agents. Suitable carriers include but are not limited to magnesium carbonate, magnesium stearate, talc, sugar, lactose, pectin, dextrin, starch, gelatin, tragacanth, methylcellulose, sodium carboxymethylcellulose, a low melting wax, cocoa butter, and the like. Generally, the one or more active agents can be included at concentration levels ranging from about 0.5%, about 5%, about 10%, about 20%, or about 30% to about 50%, about 60%, about 70%, about 80%, or about 90% by weight of the total composition of oral dosage forms, in an amount sufficient to provide a desired unit of dosage.
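As a rough illustration of the weight-percent arithmetic in the preceding paragraph, the following Python sketch computes the mass of active agent in one oral unit dose at a given percentage by weight. The function name `active_mass_mg`, the 500 mg tablet mass, and the 20% loading are hypothetical values chosen only for illustration; the only figures taken from the text are the bounds of about 0.5% and about 90% by weight.

```python
def active_mass_mg(total_mass_mg: float, weight_percent: float) -> float:
    """Mass of active agent (mg) in a dosage form of total_mass_mg at weight_percent (w/w)."""
    # Bounds stated in the text: about 0.5% to about 90% by weight.
    if not 0.5 <= weight_percent <= 90.0:
        raise ValueError("weight_percent is outside the about 0.5% to about 90% range")
    if total_mass_mg <= 0:
        raise ValueError("total_mass_mg must be positive")
    # Convert the percentage to a fraction and apply it to the total mass.
    return total_mass_mg * weight_percent / 100.0


if __name__ == "__main__":
    # Hypothetical example: a 500 mg tablet loaded at 20% by weight.
    print(active_mass_mg(500.0, 20.0))  # prints 100.0
```

The range check treats the "about" bounds as hard limits purely for simplicity; an actual formulation would be governed by the desired unit of dosage rather than by this check.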

[0379] Aqueous suspensions for oral use can contain one or more active agents with pharmaceutically acceptable excipients, such as a suspending agent (e.g., methyl cellulose), a wetting agent (e.g., lecithin, lysolecithin and/or a long-chain fatty alcohol), as well as coloring agents, preservatives, flavoring agents, and the like.

[0380] Oils or non-aqueous solvents can be required to bring the one or more active agents into solution, due to, for example, the presence of large lipophilic moieties. Alternatively, emulsions, suspensions, or other preparations, for example, liposomal preparations, can be used. With respect to liposomal preparations, any known methods for preparing liposomes for treatment of a condition can be used. See, for example, Bangham et al., J. Mol. Biol. 23: 238-252 (1965) and Szoka et al., Proc. Natl. Acad. Sci. USA 75: 4194-4198 (1978), incorporated herein by reference. Ligands can also be attached to the liposomes to direct these compositions to particular sites of action.

[0381] Pharmaceutical preparations for oral use can be obtained by combining the one or more active agents with a solid excipient, optionally grinding the resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; flavoring elements, cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or polyvinyl pyrrolidone (PVP). If desired, disintegrating agents can be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate. The agents can also be formulated as a sustained release preparation.

[0382] Dragee cores can be provided with suitable coatings. For this purpose, concentrated sugar solutions can be used, which can optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments can be added to the tablets or dragee coatings for identification or to characterize different combinations of active agents.

[0383] Pharmaceutical preparations that can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active agents can be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers can be added.
All formulations for oral administration can be in dosages suitable for administration.

[0384] Other forms suitable for oral administration include liquid form preparations including emulsions, syrups, elixirs, aqueous solutions, aqueous suspensions, or solid form preparations which are intended to be converted shortly before use to liquid form preparations. Emulsions can be prepared in solutions, for example, in aqueous propylene glycol solutions or can contain emulsifying agents, for example, such as lecithin, sorbitan monooleate, or acacia. Aqueous solutions can be prepared by dissolving the active component in water and adding suitable colorants, flavors, stabilizers, and thickening agents. Aqueous suspensions can be prepared by dispersing the finely divided active component in water with viscous material, such as natural or synthetic gums, resins, methylcellulose, sodium carboxymethylcellulose, and other well-known suspending agents. Suitable fillers or carriers with which the compositions can be administered include agar, alcohol, fats, lactose, starch, cellulose derivatives, polysaccharides, polyvinylpyrrolidone, silica, sterile saline and the like, or mixtures thereof used in suitable amounts. Solid form preparations include solutions, suspensions, and emulsions, and can contain, in addition to the active component, colorants, flavors, stabilizers, buffers, artificial and natural sweeteners, dispersants, thickeners, solubilizing agents, and the like.

[0385] A syrup or suspension can be made by adding the active compound to a concentrated, aqueous solution of a sugar, e.g., sucrose, to which can also be added any accessory ingredients. Such accessory ingredients can include flavoring, an agent to retard crystallization of the sugar or an agent to increase the solubility of any other ingredient, e.g., a polyhydric alcohol, for example, glycerol or sorbitol.

[0386] When formulating compounds for oral administration, it can be desirable to utilize gastroretentive formulations to enhance absorption from the gastrointestinal (GI) tract. A formulation which is retained in the stomach for several hours can release an active agent slowly and provide a sustained release that can be used herein. Disclosures of such gastroretentive formulations are found in Klausner, E. A.; Lavy, E.; Barta, M.; Cserepes, E.; Friedman, M.; Hoffman, A. 2003 "Novel gastroretentive dosage forms: evaluation of gastroretentivity and its effect on levodopa in humans" Pharm. Res. 20, 1466-73; Hoffman, A.; Stepensky, D.; Lavy, E.; Eyal, S.; Klausner, E.; Friedman, M. 2004 "Pharmacokinetic and pharmacodynamic aspects of gastroretentive dosage forms" Int. J. Pharm. 11, 141-53; Streubel, A.; Siepmann, J.; Bodmeier, R. 2006 "Gastroretentive drug delivery systems" Expert Opin. Drug Deliver. 3, 217-3; and Chavanpatil, M. D.; Jain, P.; Chaudhari, S.; Shear, R.; Vavia, P. R. "Novel sustained release, swellable and bioadhesive gastroretentive drug delivery system for ofloxacin" Int. J. Pharm. 2006 epub March 24. Expandable, floating and bioadhesive techniques can be utilized to maximize absorption of an active agent.

[0387] 5b5. Formulations for ophthalmic administration

[0388] In some instances, ocular viral infections can be effectively treated with ophthalmic solutions, suspensions, ointments, or inserts comprising one or more active agents. Eye drops can be prepared by dissolving the one or more active agents in a sterile aqueous solution such as physiological saline, buffering solution, etc., or by combining powder compositions to be dissolved before use. Other vehicles can be chosen, as is known in the art, including but not limited to: balanced salt solution, saline solution, water soluble polyethers such as polyethylene glycol, polyvinyls, such as polyvinyl alcohol and povidone, cellulose derivatives such as methylcellulose and hydroxypropyl methylcellulose, petroleum derivatives such as mineral oil and white petrolatum, animal fats such as lanolin, polymers of acrylic acid such as carboxypolymethylene gel, vegetable fats such as peanut oil and polysaccharides such as dextrans, and glycosaminoglycans such as sodium hyaluronate. If desired, additives ordinarily used in eye drops can be added. Such additives include isotonizing agents (e.g., sodium chloride, etc.), buffering agents (e.g., boric acid, sodium monohydrogen phosphate, sodium dihydrogen phosphate, etc.), preservatives (e.g., benzalkonium chloride, benzethonium chloride, chlorobutanol, etc.), thickeners (e.g., saccharide such as lactose, mannitol, maltose, etc.; e.g., hyaluronic acid or its salt such as sodium hyaluronate, potassium hyaluronate, etc.; e.g., mucopolysaccharide such as chondroitin sulfate, etc.; e.g., sodium polyacrylate, carboxyvinyl polymer, crosslinked polyacrylate, polyvinyl alcohol, polyvinyl pyrrolidone, methyl cellulose, hydroxy propyl methylcellulose, hydroxyethyl cellulose, carboxymethyl cellulose, hydroxy propyl cellulose, or other agents known to those skilled in the art).

[0389] 5b6. Other formulations

[0390] In some embodiments, viral infections of the ear are effectively treated with otic solutions, suspensions, ointments, or inserts comprising one or more active agents provided herein. In some embodiments, a composition described herein, e.g., a vaccine, is formulated for administration as a suppository. For example, a low melting wax, such as a mixture of triglycerides, fatty acid glycerides, Witepsol S55 (trademark of Dynamit Nobel Chemical, Germany), or cocoa butter can be first melted and the active component can be dispersed homogeneously, for example, by stirring. The molten homogeneous mixture can then be poured into convenient sized molds and allowed to cool and solidify. In some embodiments, a composition described herein, e.g., a vaccine, is formulated for vaginal administration. In some cases, pessaries, tampons, creams, gels, pastes, foams, or sprays contain one or more active agents and one or more carriers as are known in the art.

[0391] 5c. Ingredients, e.g., carriers, excipients

[0392] In some embodiments, a composition provided herein, e.g., a vaccine, includes one or more carriers and excipients (including but not limited to buffers, carbohydrates, mannitol, proteins, peptides or amino acids such as glycine, antioxidants, bacteriostats, chelating agents, suspending agents, thickening agents and/or preservatives), water, oils including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like, saline solutions, aqueous dextrose and glycerol solutions, flavoring agents, coloring agents, detackifiers and other acceptable additives, adjuvants, or binders, other pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as pH buffering agents, tonicity adjusting agents, emulsifying agents, wetting agents and the like. Examples of excipients include starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol, water, ethanol, and the like. In another instance, the composition is substantially free of preservatives. In other embodiments, the composition contains at least one preservative. General methodology on pharmaceutical dosage forms is found in Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems (Lippincott Williams & Wilkins, Baltimore Md. (1999)). It will be recognized that, while any suitable carrier known to those of ordinary skill in the art can be employed to administer the pharmaceutical compositions described herein, the type of carrier can vary depending on the mode of administration.
Suitable formulations and additional carriers are described in Remington "The Science and Practice of Pharmacy" (20th Ed., Lippincott Williams & Wilkins, Baltimore Md.), the teachings of which are incorporated by reference in their entirety herein.

[0393] 5c1. Liposomes and microspheres [0394] A composition provided herein, e.g., a vaccine, can be encapsulated within liposomes. Biodegradable microspheres can also be employed as carriers for the composition. Suitable biodegradable microspheres are disclosed, for example, in U.S. Pat. Nos. 4,897,268; 5,075,109; 5,928,647; 5,811,128; 5,820,883; 5,853,763; 5,814,344 and 5,942,252. [0395] A composition provided herein, e.g., a vaccine, can be administered in liposomes or microspheres (or microparticles). Methods for preparing liposomes and microspheres for administration to a patient are known to those of skill in the art. For example, U.S. Pat. No. 4,789,734, the contents of which are hereby incorporated by reference, describes methods for encapsulating biological materials in liposomes. The material can be dissolved in an aqueous solution, the appropriate phospholipids and lipids added, along with surfactants if required, and the material dialyzed or sonicated, as desired. A review of known methods is provided by G. Gregoriadis, Chapter 14, "Liposomes," Drug Carriers in Biology and Medicine, pp. 287-341 (Academic Press, 1979). [0396] Microspheres formed of polymers or proteins are known to those skilled in the art, and can be tailored for passage through the gastrointestinal tract directly into the blood stream. Alternatively, the compound can be incorporated and the microspheres, or composite of microspheres, implanted for slow release over a period of time ranging from days to months. See, for example, U.S. Pat. Nos. 4,906,474, 4,925,673 and 3,625,214, and Jein, TIPS 19:155-157 (1998), the contents of which are hereby incorporated by reference.

[0397] 5c2. Preservatives/sterility [0398] A composition provided herein, e.g., a vaccine, can include material for a single administration (e.g., immunization), or can include material for multiple administrations (e.g., immunizations) (e.g., a "multidose" kit). The composition, e.g., vaccine, can include one or more preservatives such as thiomersal or 2-phenoxyethanol. In some embodiments, the vaccine is substantially free from (e.g., <10 µg/ml) mercurial material, e.g., thiomersal-free. In some embodiments, α-tocopherol succinate is used as an alternative to mercurial compounds. Preservatives can be used to prevent microbial contamination during use. Suitable preservatives include: benzalkonium chloride, thimerosal, chlorobutanol, methyl paraben, propyl paraben, phenylethyl alcohol, edetate disodium, sorbic acid, Onamer M, or other agents known to those skilled in the art. In ophthalmic products, for example, such preservatives can be employed at a level of from 0.004% to 0.02%. In the compositions of the present application the preservative, e.g., benzalkonium chloride, can be employed at a level of from 0.001% to less than 0.01%, e.g., from 0.001% to 0.008%, preferably about 0.005% by weight. A concentration of benzalkonium chloride of 0.005% can be sufficient to preserve a composition provided herein from microbial attack. [0399] As an alternative (or in addition) to including a preservative in multidose compositions, the composition, e.g., vaccine, can be contained in a container having an aseptic adaptor for removal of material.

[0400] In some embodiments, a composition provided herein, e.g., a vaccine, is sterile. The composition, e.g., vaccine, can be non-pyrogenic e.g., containing < 1 EU (endotoxin unit, a standard measure) per dose, and can be <0.1 EU per dose. The composition, e.g., vaccine, can be formulated as a sterile solution or suspension, in suitable vehicles, known in the art. The composition, e.g., vaccine, can be sterilized by conventional, known sterilization techniques, e.g., the composition can be sterile filtered.

[0401] 5c3. Salts/osmolality [0402] In some embodiments, a composition provided herein comprises one or more salts. For controlling the tonicity, a physiological salt such as a sodium salt can be included in a composition provided herein, e.g., vaccine. Other salts can include potassium chloride, potassium dihydrogen phosphate, disodium phosphate, and/or magnesium chloride, or the like. In some embodiments, the composition, e.g., vaccine, is formulated with one or more pharmaceutically acceptable salts. The one or more pharmaceutically acceptable salts can include those of the inorganic ions, such as, for example, sodium, potassium, calcium, magnesium ions, and the like. Such salts can include salts with inorganic or organic acids, such as hydrochloric acid, hydrobromic acid, phosphoric acid, nitric acid, sulfuric acid, methanesulfonic acid, p-toluenesulfonic acid, acetic acid, fumaric acid, succinic acid, lactic acid, mandelic acid, malic acid, citric acid, tartaric acid, or maleic acid. If an active agent (e.g., peptide) contains a carboxy group or other acidic group, it can be converted into a pharmaceutically acceptable addition salt with inorganic or organic bases. Examples of suitable bases include sodium hydroxide, potassium hydroxide, ammonia, cyclohexylamine, dicyclohexylamine, ethanolamine, diethanolamine, triethanolamine, and the like. [0403] A composition, e.g., vaccine, can have an osmolality of between 200 mOsm/kg and 400 mOsm/kg, between 240 and 360 mOsm/kg, or within the range of 290-310 mOsm/kg.

[0404] 5c4. Buffers/pH [0405] In some embodiments, a composition provided herein, e.g., vaccine, can comprise one or more buffers, such as a Tris buffer; a borate buffer; a succinate buffer; a histidine buffer (e.g., with an aluminum hydroxide adjuvant); or a citrate buffer. Buffers, in some cases, are included in the 5-20 mM range. [0406] In some embodiments, a composition provided herein, e.g., vaccine, has a pH between about 5.0 and about 8.5, between about 6.0 and about 8.0, between about 6.5 and about 7.5, or between about 7.0 and about 7.8.
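The numeric windows recited above for osmolality and pH can be expressed as a simple acceptance check. The following is a minimal illustrative sketch only and is not part of the disclosed compositions or methods; the names `Measurement`, `within`, and `check_formulation` are hypothetical, and the defaults assume the broadest recited ranges (200-400 mOsm/kg and pH about 5.0 to about 8.5).

```python
from dataclasses import dataclass

# Broadest ranges recited in the text; narrower recited ranges can be passed instead.
OSMOLALITY_RANGE = (200.0, 400.0)  # mOsm/kg
PH_RANGE = (5.0, 8.5)


@dataclass
class Measurement:
    """Hypothetical container for two measured properties of a formulation."""
    osmolality_mosm_kg: float
    ph: float


def within(value, bounds):
    """Return True if value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def check_formulation(m, osm_range=OSMOLALITY_RANGE, ph_range=PH_RANGE):
    """Report, per property, whether the measurement falls in the chosen range."""
    return {
        "osmolality_ok": within(m.osmolality_mosm_kg, osm_range),
        "ph_ok": within(m.ph, ph_range),
    }
```

A narrower embodiment can be checked by passing other recited ranges, for example `check_formulation(m, osm_range=(290.0, 310.0), ph_range=(7.0, 7.8))`.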

[0407] 5c5. Detergents/surfactants [0408] In some embodiments, a composition provided herein, e.g., vaccine, includes one or more detergents and/or surfactants, e.g., polyoxyethylene sorbitan ester surfactants (commonly referred to as "Tweens"), e.g., polysorbate 20 and polysorbate 80; copolymers of ethylene oxide (EO), propylene oxide (PO), and/or butylene oxide (BO), sold under the DOWFAX™ tradename, such as linear EO/PO block copolymers; octoxynols, which can vary in the number of repeating ethoxy (oxy-1,2-ethanediyl) groups, e.g., octoxynol-9 (Triton X-100, or t-octylphenoxypolyethoxyethanol); (octylphenoxy)polyethoxyethanol (IGEPAL CA-630/NP-40); phospholipids such as phosphatidylcholine (lecithin); nonylphenol ethoxylates, such as the Tergitol™ NP series; polyoxyethylene fatty ethers derived from lauryl, cetyl, stearyl and oleyl alcohols (known as Brij surfactants), such as triethyleneglycol monolauryl ether (Brij 30); and sorbitan esters (commonly known as "SPANs"), such as sorbitan trioleate (Span 85) and sorbitan monolaurate, an octoxynol (such as octoxynol-9 (Triton X-100) or t-octylphenoxypolyethoxyethanol), a cetyl trimethyl ammonium bromide ("CTAB"), or sodium deoxycholate, particularly for a split or surface antigen vaccine. The one or more detergents and/or surfactants can be present only at trace amounts. In some cases, the composition, e.g., vaccine, can include less than 1 mg/ml of each of octoxynol-10 and polysorbate 80. Non-ionic surfactants can be used herein. Surfactants can be classified by their "HLB" (hydrophile/lipophile balance). In some cases, surfactants have an HLB of at least 10, at least 15, and/or at least 16.

[0409] In some embodiments, mixtures of surfactants are used in a composition, e.g., vaccine, e.g., Tween 80/Span 85 mixtures. A combination of a polyoxyethylene sorbitan ester and an octoxynol can also be suitable. Another combination can comprise laureth 9 plus a polyoxyethylene sorbitan ester and/or an octoxynol. The amounts of surfactants (% by weight) can be: polyoxyethylene sorbitan esters (such as Tween 80) 0.01 to 1%, in particular about 0.1%; octyl- or nonylphenoxy polyoxyethanols (such as Triton X-100, or other detergents in the Triton series) 0.001 to 0.1%, in particular 0.005 to 0.02%; polyoxyethylene ethers (such as laureth 9) 0.1 to 20%, preferably 0.1 to 10% and in particular 0.1 to 1% or about 0.5%.

[0410] 5c6. Adjuvants [0411] In some embodiments, a composition provided herein, e.g., vaccine, comprises one or more adjuvants. An adjuvant can be used to enhance the immune response (humoral and/or cellular) elicited in a subject receiving the vaccine. Sometimes, an adjuvant can elicit a Th1-type response. In some cases, an adjuvant can elicit a Th2-type response. A Th1-type response can be characterized by the production of cytokines such as IFN-γ, as opposed to a Th2-type response, which can be characterized by the production of cytokines such as IL-4, IL-5, and IL-10.

[0412] In some embodiments, lipid-based adjuvants, such as MPL and MDP, are used with a composition, e.g., vaccine, disclosed herein. Monophosphoryl lipid A (MPL), for example, is an adjuvant that can cause increased presentation of liposomal antigen to specific T lymphocytes. In addition, a muramyl dipeptide (MDP) can also be used as a suitable adjuvant in conjunction with a composition, e.g., vaccine, described herein.

[0413] Adjuvants can also comprise stimulatory molecules such as cytokines. Non-limiting examples of cytokines include: CCL20, α-interferon (IFN-α), β-interferon (IFN-β), γ-interferon, platelet derived growth factor (PDGF), TNFα, TNFβ, GM-CSF, epidermal growth factor (EGF), cutaneous T cell-attracting chemokine (CTACK), epithelial thymus-expressed chemokine (TECK), mucosae-associated epithelial chemokine (MEC), IL-12, IL-15, IL-28, MHC, CD80, CD86, IL-1, IL-2, IL-4, IL-5, IL-6, IL-10, IL-18, MCP-1, MIP-1α, MIP-1β, IL-8, L-selectin, P-selectin, E-selectin, CD34, GlyCAM-1, MadCAM-1, LFA-1, VLA-1, Mac-1, p150.95, PECAM, ICAM-1, ICAM-2, ICAM-3, CD2, LFA-3, M-CSF, G-CSF, mutant forms of IL-18, CD40, CD40L, vascular growth factor, fibroblast growth factor, IL-7, nerve growth factor, vascular endothelial growth factor, Fas, TNF receptor, Flt, Apo-1, p55, WSL-1, DR3, TRAMP, Apo-3, AIR, LARD, NGRF, DR4, DR5, KILLER, TRAIL-R2, TRICK2, DR6, Caspase ICE, Fos, c-jun, Sp-1, Ap-1, Ap-2, p38, p65Rel, MyD88, IRAK, TRAF6, IkB, Inactive NIK, SAP K, SAP-1, JNK, interferon response genes, NFkB, Bax, TRAIL, TRAILrec, TRAILrecDRC5, TRAIL-R3, TRAIL-R4, RANK, RANK LIGAND, Ox40, Ox40 LIGAND, NKG2D, MICA, MICB, NKG2A, NKG2B, NKG2C, NKG2E, NKG2F, TAP1, and TAP2.
[0414] Additional adjuvants include: MCP-1, MIP-1α, MIP-1β, IL-8, RANTES, L-selectin, P-selectin, E-selectin, CD34, GlyCAM-1, MadCAM-1, LFA-1, VLA-1, Mac-1, p150.95, PECAM, ICAM-1, ICAM-2, ICAM-3, CD2, LFA-3, M-CSF, G-CSF, IL-4, mutant forms of IL-18, CD40, CD40L, vascular growth factor, fibroblast growth factor, IL-7, IL-22, nerve growth factor, vascular endothelial growth factor, Fas, TNF receptor, Flt, Apo-1, p55, WSL-1, DR3, TRAMP, Apo-3, AIR, LARD, NGRF, DR4, DR5, KILLER, TRAIL-R2, TRICK2, DR6, Caspase ICE, Fos, c-jun, Sp-1, Ap-1, Ap-2, p38, p65Rel, MyD88, IRAK, TRAF6, IkB, Inactive NIK, SAP K, SAP-1, JNK, interferon response genes, NFkB, Bax, TRAIL, TRAILrec, TRAILrecDRC5, TRAIL-R3, TRAIL-R4, RANK, RANK LIGAND, Ox40, Ox40 LIGAND, NKG2D, MICA, MICB, NKG2A, NKG2B, NKG2C, NKG2E, NKG2F, TAP1, TAP2 and functional fragments thereof. [0415] In some embodiments, the one or more adjuvants is a modulator of a toll-like receptor. Examples of modulators of toll-like receptors include TLR-9 agonists and TLR-2 agonists and are not limited to small molecule modulators of toll-like receptors such as Imiquimod (R837). Other examples of adjuvants that can be used in a composition described herein, e.g., a vaccine, include saponin, CpG ODN and the like. [0416] In some cases, the one or more adjuvants is selected from bacterial toxoids, polyoxypropylene-polyoxyethylene block polymers, aluminum salts, liposomes, CpG polymers, oil-in-water emulsions, or a combination thereof.

[0417] Sometimes, the one or more adjuvants is based on aluminum salts (alum) or derivatives thereof. Exemplary alums can comprise aluminum hydroxide, aluminum phosphate, potassium aluminum sulfate, sodium aluminum sulfate, ammonium aluminum sulfate, cesium aluminum sulfate, or a mixture of aluminum and magnesium hydroxide. Alum can also comprise crystalline aluminum oxyhydroxide (AlOOH). Sometimes, AlOOH adjuvants can be composed of nanolength scale plate-like primary particles that form aggregates, representing the functional subunits in the material. These aggregates can be porous and can have irregular shapes that range from about 1 to about 20 µm in diameter. Upon mixing with antigen, the aggregates can be broken into smaller fragments that can reaggregate to distribute the absorbed antigen throughout the vaccine. In some embodiments, the adjuvant comprises ordered rod-like AlO(OH) nanoparticles. In some embodiments, Alum is an alum described in Sun et al., "Engineering an effective immune adjuvant by designed control of shape and crystallinity of aluminum oxyhydroxide nanoparticles" ACS Nano 7(12): 10834-10849, which is incorporated by reference in its entirety. [0418] In some embodiments, the one or more adjuvants include an oil-in-water emulsion. The oil-in-water emulsion can include at least one oil and at least one surfactant, with the oil(s) and surfactant(s) being biodegradable and biocompatible. The oil droplets in the emulsion can be less than 5 µm in diameter, and can even have a sub-micron diameter, with these small sizes being achieved with a microfluidiser to provide stable emulsions. Droplets with a size less than 220 nm can be preferred as they can be subjected to filter sterilization. [0419] The oils used can include those from an animal (such as fish) or vegetable source. Sources for vegetable oils can include nuts, seeds, and grains. Peanut oil, soybean oil, coconut oil, and olive oil, the most commonly available, exemplify the nut oils.
Jojoba oil, e.g., obtained from the jojoba bean, can be used. Seed oils include safflower oil, cottonseed oil, sunflower seed oil, sesame seed oil, etc. The grain group can include: corn oil and oils of other cereal grains such as wheat, oats, rye, rice, teff, triticale, and the like. 6-10 carbon fatty acid esters of glycerol and 1,2-propanediol, while not occurring naturally in seed oils, may be prepared by hydrolysis, separation and esterification of the appropriate materials starting from the nut and seed oils. Fats and oils from mammalian milk can be metabolizable and can be used with the compositions, e.g., vaccines, described herein. The procedures for separation, purification, saponification, and other means for obtaining pure oils from animal sources are known in the art. Fish can contain metabolizable oils which can be readily recovered. For example, cod liver oil, shark liver oils, and whale oil such as spermaceti can exemplify several of the fish oils which can be used herein. A number of branched chain oils can be synthesized biochemically in 5-carbon isoprene units and can be generally referred to as terpenoids. Shark liver oil contains a branched, unsaturated terpenoid known as squalene, 2,6,10,15,19,23-hexamethyl-2,6,10,14,18,22-tetracosahexaene. Squalane, the saturated analog to squalene, can also be used. Fish oils, including squalene and squalane, can be readily available from commercial sources or can be obtained by methods known in the art. [0420] Other useful oils include tocopherols, which can be included in a composition described herein, e.g., a vaccine, for use in elderly patients (e.g., aged 60 years or older), as vitamin E can have a positive effect on the immune response in this subject group. Further, tocopherols can have antioxidant properties that can help to stabilize the emulsions. Various tocopherols exist (α, β, γ, δ, ε or ξ); in some cases, α is used. An example of α-tocopherol is DL-α-tocopherol.
α-Tocopherol succinate can be compatible with compositions provided herein, e.g., influenza vaccines, and can be a useful preservative as an alternative to mercurial compounds. In some embodiments, mixtures of oils can be used, e.g., squalene and α-tocopherol. An oil content in the range of 2-20% (by volume) can be used. [0421] Specific oil-in-water emulsion adjuvants include, e.g., a submicron emulsion of squalene, polysorbate 80, and sorbitan trioleate. The composition of the emulsion by volume can be about 5% squalene, about 0.5% polysorbate 80 and about 0.5% Span 85. In weight terms, these ratios become 4.3% squalene, 0.5% polysorbate 80 and 0.48% Span 85. This adjuvant is known as "MF59". The MF59 emulsion advantageously includes citrate ions, e.g., 10 mM sodium citrate buffer. [0422] An oil-in-water emulsion can be a submicron emulsion of squalene, a tocopherol, and polysorbate 80. These emulsions can have from 2 to 10% squalene, from 2 to 10% tocopherol and from 0.3 to 3% polysorbate 80, and the weight ratio of squalene:tocopherol can be preferably ≤ 1 (e.g., 0.90) as this can provide a more stable emulsion. Squalene and polysorbate 80 may be present at a volume ratio of about 5:2 or at a weight ratio of about 11:5. One such emulsion can be made by dissolving Tween 80 in PBS to give a 2% solution, then mixing 90 ml of this solution with a mixture of (5 g of DL-α-tocopherol and 5 ml squalene), then microfluidising the mixture. The resulting emulsion has submicron oil droplets, e.g., with an average diameter of between 100 and 250 nm, preferably about 180 nm. The emulsion may also include a 3-de-O-acylated monophosphoryl lipid A (3d-MPL). Another useful emulsion of this type may comprise, per human dose, 0.5-10 mg squalene, 0.5-11 mg tocopherol, and 0.1-4 mg polysorbate 80. [0423] An oil-in-water emulsion can be an emulsion of squalene, a tocopherol, and a Triton detergent (e.g., Triton X-100). The emulsion can also include a 3d-MPL (see below).
The emulsion can contain a phosphate buffer. [0424] An oil-in-water emulsion can be an emulsion comprising a polysorbate (e.g., polysorbate 80), a Triton detergent (e.g., Triton X-100) and a tocopherol (e.g., an α-tocopherol succinate). The emulsion can include these three components at a mass ratio of about 75:11:10 (e.g., 750 µg/ml polysorbate 80, 110 µg/ml Triton X-100 and 100 µg/ml α-tocopherol succinate), and these concentrations should include any contribution of these components from antigens. The emulsion can also include squalene. The emulsion may also include a 3d-MPL. The aqueous phase can contain a phosphate buffer. [0425] An oil-in-water emulsion can be an emulsion of squalane, polysorbate 80, and poloxamer 401 ("Pluronic™ L121"). The emulsion can be formulated in phosphate buffered saline, pH 7.4. This emulsion can be a useful delivery vehicle for muramyl dipeptides, and can be used with threonyl-MDP in the "SAF-1" adjuvant (0.05-1% Thr-MDP, 5% squalane, 2.5% Pluronic L121 and 0.2% polysorbate 80). It can also be used without the Thr-MDP, as in the "AF" adjuvant (5% squalane, 1.25% Pluronic L121 and 0.2% polysorbate 80). [0426] An oil-in-water emulsion can be an emulsion comprising squalene, an aqueous solvent, a polyoxyethylene alkyl ether hydrophilic nonionic surfactant (e.g., polyoxyethylene (12) cetostearyl ether) and a hydrophobic nonionic surfactant (e.g., a sorbitan ester or mannide ester, such as sorbitan monooleate or "Span 80"). The emulsion can be thermoreversible and/or can have at least 90% of the oil droplets (by volume) with a size less than 200 nm. The emulsion can also include one or more of: alditol; a cryoprotective agent (e.g., a sugar, such as dodecylmaltoside and/or sucrose); and/or an alkylpolyglycoside. The emulsion can include a TLR4 agonist. Such emulsions can be lyophilized. [0427] An oil-in-water emulsion can be an emulsion of squalene, poloxamer 105 and Abil-Care.
The final concentration (weight) of these components in adjuvanted vaccines can be 5% squalene, 4% poloxamer 105 (pluronic polyol) and 2% Abil-Care 85 (Bis-PEG/PPG-16/16 PEG/PPG-16/16 dimethicone; caprylic/capric triglyceride). [0428] An oil-in-water emulsion can be an emulsion having from 0.5-50% of an oil, 0.1-10% of a phospholipid, and 0.05-5% of a non-ionic surfactant. Phospholipid components can include phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, phosphatidylglycerol, phosphatidic acid, sphingomyelin, and cardiolipin. Submicron droplet sizes are advantageous. [0429] An oil-in-water emulsion can be a submicron oil-in-water emulsion of a non-metabolisable oil (such as light mineral oil) and at least one surfactant (such as lecithin, Tween 80 or Span 80). Additives can include QuilA saponin, cholesterol, a saponin-lipophile conjugate (such as GPI-0100, produced by addition of aliphatic amine to desacylsaponin via the carboxyl group of glucuronic acid), dimethyldioctadecylammonium bromide, and/or N,N-dioctadecyl-N,N-bis(2-hydroxyethyl)propanediamine. [0430] In some embodiments, a composition provided herein, e.g., vaccine, contains adjuvants such as hydrophilic or lipophilic gelling agents, hydrophilic or lipophilic active agents, preserving agents, antioxidants, solvents, fragrances, fillers, sunscreens, odor-absorbers, and dyestuffs. The amounts of these various adjuvants can be those used in the fields considered and, for example, are from about 0.01% to about 20% of the total weight of the composition. Depending on their nature, these adjuvants can be introduced into a fatty phase, into an aqueous phase and/or into lipid vesicles.
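As a worked check of the mass ratio recited in paragraph [0424], the example concentrations 750, 110, and 100 (taken here to share a single mass-per-volume unit) reduce to 75:11:10 when divided by their greatest common divisor. The sketch below is illustrative only; `simplest_ratio` is a hypothetical helper name, not a term from this application.

```python
from functools import reduce
from math import gcd


def simplest_ratio(*amounts):
    """Reduce integer amounts (all in the same unit) to the smallest whole-number ratio."""
    divisor = reduce(gcd, amounts)
    return tuple(a // divisor for a in amounts)


# Example concentrations from paragraph [0424], assumed to share one unit.
polysorbate_80, triton_x100, tocopherol_succinate = 750, 110, 100
print(simplest_ratio(polysorbate_80, triton_x100, tocopherol_succinate))  # prints (75, 11, 10)
```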
[0431] A composition provided herein, e.g., a vaccine, comprising one or more active agents such as a peptide, a nucleic acid molecule, an antibody or fragments thereof, an APC, and/or virus described herein, in combination with one or more adjuvants, can be formulated to comprise certain molar ratios. For example, molar ratios of about 99:1 to about 1:99 of an active agent in combination with one or more adjuvants can be used. In some embodiments, the range of molar ratios of an active agent in combination with one or more adjuvants can be selected from about 80:20 to about 20:80; about 75:25 to about 25:75; about 70:30 to about 30:70; about 66:33 to about 33:66; about 60:40 to about 40:60; about 50:50; and about 90:10 to about 10:90. The molar ratio of an active agent in combination with one or more adjuvants can be about 1:9, and in some cases can be about 1:1. The active agent such as a peptide, a nucleic acid molecule, an antibody or fragments thereof, an APC, and/or virus described herein, in combination with one or more adjuvants can be formulated together, in the same dosage unit, e.g., in one vial, suppository, tablet, capsule, or aerosol spray; or each agent, form, and/or compound can be formulated in separate units, e.g., two vials, suppositories, tablets, two capsules, a tablet and a vial, an aerosol spray, and the like. [0432] In some embodiments, a composition provided herein, e.g., vaccine, comprises one or more adjuvants selected from the list Alum, monophosphoryl lipid A (MPL), imiquimod (R837) (a small synthetic antiviral molecule and TLR7 ligand), Pam2Cys, and ordered rod-like AlO(OH) nanoparticles (Rod). In some embodiments, a composition provided herein, e.g., vaccine, comprises Alum. In some embodiments, a composition, e.g., vaccine, provided herein comprises Rod. In some embodiments, a composition provided herein, e.g., vaccine, comprises Rod, MPL, and R837.
In some embodiments, a composition provided herein, e.g., vaccine, comprises MPL and R837. In some embodiments, a composition provided herein, e.g., vaccine, comprises Alum, MPL, and R837. In some embodiments, a composition provided herein, e.g., vaccine, comprises Pam2Cys.

[0433] 5c7. Additional agents [0434] In some embodiments, a composition provided herein, e.g., a vaccine, is administered with an additional active agent. The choice of the additional active agent can depend, at least in part, on the condition being treated. The additional active agent can include, for example, any active agent having a therapeutic effect for a pathogen infection (e.g., viral infection), including, e.g., drugs used to treat inflammatory conditions such as an NSAID, e.g., ibuprofen, naproxen, acetaminophen, ketoprofen, or aspirin. In some embodiments, a formulation for treating or preventing an influenza infection can contain one or more conventional influenza antiviral agents, such as Vitamin D, amantadine, arbidol, laninamivir, rimantadine, zanamivir, peramivir, and oseltamivir. In treatments for retroviral infections, such as HIV, formulations can contain one or more conventional antiviral drugs, such as protease inhibitors (lopinavir/ritonavir (Kaletra®), indinavir (Crixivan®), ritonavir (Norvir®), nelfinavir (Viracept®), saquinavir hard gel capsules (Invirase®), atazanavir (Reyataz®), amprenavir (Agenerase®), fosamprenavir (Telzir®), tipranavir (Aptivus®)), reverse transcriptase inhibitors, including non-nucleoside and nucleoside/nucleotide inhibitors (AZT (zidovudine, Retrovir®), ddI (didanosine, Videx®), 3TC (lamivudine, Epivir®), d4T (stavudine, Zerit®), abacavir (Ziagen®), FTC (emtricitabine, Emtriva®), tenofovir (Viread®), efavirenz (Sustiva®) and nevirapine (Viramune®)), fusion inhibitors (T20 (enfuvirtide, Fuzeon®)), integrase inhibitors (MK-0518 and GS-9137), and maturation inhibitors (PA-457 (Bevirimat®)). As another example, formulations can additionally contain one or more supplements, such as vitamin C, vitamin E, and other vitamins and anti-oxidants. [0435] In some embodiments, a composition provided herein, e.g., a vaccine, can include one or more antibiotics (e.g., neomycin, kanamycin, polymyxin B).
[0436] In some embodiments, the composition, e.g., a vaccine, is gluten free.

[0437] 5c8. Co-solvents [0438] In some embodiments, the solubility of the components of a composition provided herein can be enhanced by a co-solvent in the composition. Such co-solvents include polysorbate 20, 60, and 80, Pluronic F68, F-84 and P-103, cyclodextrin, or other agents known to those skilled in the art. Such co-solvents can be employed at a level of from about 0.01% to 2% by weight.

[0439] 5c9. Penetration enhancers [0440] In some embodiments, a composition provided herein, e.g., a vaccine, includes one or more penetration enhancers. For example, the composition can comprise suitable solid or gel phase carriers or excipients that increase penetration or help delivery of agents or combinations of agents of the invention across a permeability barrier, e.g., the skin. Examples of penetration-enhancing compounds include, e.g., water, alcohols (e.g., terpenes like methanol, ethanol, 2-propanol), sulfoxides (e.g., dimethyl sulfoxide, decylmethyl sulfoxide, tetradecylmethyl sulfoxide), pyrrolidones (e.g., 2-pyrrolidone, N-methyl-2-pyrrolidone, N-(2-hydroxyethyl)pyrrolidone), laurocapram, acetone, dimethylacetamide, dimethylformamide, tetrahydrofurfuryl alcohol, L-α-amino acids, anionic, cationic, amphoteric or nonionic surfactants (e.g., isopropyl myristate and sodium lauryl sulfate), fatty acids, fatty alcohols (e.g., oleic acid), amines, amides, clofibric acid amides, hexamethylene lauramide, proteolytic enzymes, α-bisabolol, d-limonene, urea and N,N-diethyl-m-toluamide, and the like. Additional examples include humectants (e.g., urea), glycols (e.g., propylene glycol and polyethylene glycol), glycerol monolaurate, alkanes, alkanols, ORGELASE, calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and/or other polymers. In some embodiments, the compositions will include one or more such penetration enhancers.

[0441] 5c10. Additives for sustained release formulations [0442] In some embodiments, one or more active agents can be attached releasably to biocompatible polymers for use in sustained release formulations on, in or attached to inserts for topical, intraocular, periocular, or systemic administration. The controlled release from a biocompatible polymer can be utilized with a water soluble polymer to form an instillable formulation.
The controlled release from a biocompatible polymer, such as PLGA microspheres or nanospheres, can be utilized in a formulation suitable for intraocular implantation or injection for sustained release administration. Any suitable biodegradable and biocompatible polymer can be used.

[0443] 6. THERAPEUTIC REGIMENS [0444] A composition provided herein, e.g., a vaccine, can be administered to a subject in a dosage volume of about 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, 1.0 mL, or more. In some embodiments, a half dose, e.g., about 0.25 mL, is administered to a child. Sometimes the vaccine can be administered in a higher dose, e.g., about 1 ml.

[0445] The composition, e.g., vaccine, can be administered as a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more dose-course regimen. Sometimes, the vaccine is administered as a 2, 3, or 4 dose-course regimen. Sometimes the vaccine is administered as a 2 dose-course regimen. [0446] The administration of the first dose and second dose of the 2 dose-course regimen can be separated by about 0 days, 1 day, 2 days, 5 days, 7 days, 14 days, 21 days, 30 days, 2 months, 4 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 10 years, 20 years, or more. [0447] In some embodiments, a composition described herein, e.g., vaccine, is administered to a subject once a year, twice a year, three times a year, or every 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more years. Sometimes, the composition, e.g., vaccine, is administered to a subject every 2, 3, 4, 5, 6, 7, or more years. Sometimes, the composition, e.g., vaccine, is administered every 4, 5, 6, 7, or more years. Sometimes, the composition, e.g., vaccine, is administered to a subject once. Sometimes, the composition, e.g., vaccine, is taken by a subject as a multiple dose vaccine over a period of time, e.g., a 2-dose vaccine wherein the second dose is taken 4-5 years after the first dose. [0448] The dosage examples are not limiting and are only used to exemplify particular dosing regimens for administering a composition, e.g., vaccine, described herein. In some embodiments, a "therapeutically effective amount" for use in a human can be determined from an animal model. For example, a dose for a human can be formulated to achieve circulating, liver, topical, and/or gastrointestinal concentrations that have been found to be therapeutically effective in an animal. Based on animal data, and other types of similar data, those skilled in the art can determine a therapeutically effective amount of a composition, e.g., vaccine, appropriate for administration to a human.
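The day-based dose separations listed in paragraph [0446] translate directly into calendar dates. The following minimal sketch is illustrative only and does not describe any particular regimen; the helper name `dose_dates` and the example start date are assumptions, and only intervals expressed in days are handled.

```python
from datetime import date, timedelta


def dose_dates(first_dose, intervals_days):
    """Return the date of each dose, given the gaps in days between consecutive doses."""
    dates = [first_dose]
    for gap in intervals_days:
        dates.append(dates[-1] + timedelta(days=gap))
    return dates


# Hypothetical 2 dose-course regimen with doses separated by 21 days.
for d in dose_dates(date(2015, 4, 6), [21]):
    print(d.isoformat())
```

Longer regimens are expressed by adding gaps, for example `dose_dates(start, [30, 30])` for a 3 dose-course regimen with 30 days between doses.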
[0449] The term "therapeutically effective amount" as used herein can mean an amount which is effective to alleviate, ameliorate, or prevent a symptom or sign of a disease or condition to be treated. For example, in some embodiments, a therapeutically effective amount is an amount which has a beneficial effect in a subject having signs and/or symptoms of a viral infection, e.g., an influenza infection, e.g., an influenza A infection. In some embodiments, a therapeutically effective amount is an amount which inhibits or reduces signs and/or symptoms of a viral infection, e.g., an influenza infection, e.g., an influenza A infection, as compared to a control. Signs and symptoms of an influenza infection, e.g., an influenza A infection, are well-known in the art and can include fever, cough, sore throat, runny nose, stuffy nose, headache, muscle aches, chills, fatigue (tiredness), nausea, vomiting, diarrhea, pain (e.g., abdominal pain), conjunctivitis, shortness of breath, difficulty breathing, pneumonia, acute respiratory distress, viral pneumonia, respiratory failure, neurologic change (e.g., altered mental status, seizure), or a combination thereof. In some embodiments, the therapeutically effective amount is one which is sufficient to reduce any of the signs and/or symptoms by about, or at least, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% in a subject as compared to a control. [0450] A therapeutically effective amount, when referring to one or more active agents, can be a dose range, mode of administration, formulation, etc., that has been recommended or approved by any of the various regulatory or advisory organizations in the medical or pharmaceutical arts (e.g., FDA, AMA) or by the manufacturer or supplier. [0451] A composition provided herein, e.g., a vaccine, can be administered before, during, or after the onset of a symptom associated with a pathogen infection, e.g., an influenza A infection.
Exemplary symptoms can include fever, cough, sore throat, runny nose, stuffy nose, headache, muscle aches, chills, fatigue, nausea, vomiting, diarrhea, pain (e.g., abdominal pain), conjunctivitis, shortness of breath, difficulty breathing, pneumonia, acute respiratory distress, viral pneumonia, respiratory failure, neurologic change (e.g., altered mental status, seizure), or a combination thereof. In some embodiments, the composition, e.g., vaccine, is administered to a subject in order to treat a pathogen infection, e.g., influenza infection, e.g., influenza A infection. In some embodiments, the composition, e.g., vaccine, is administered to a subject for a preventive purpose, such as a prophylactic treatment of a pathogen infection, e.g., influenza infection, e.g., influenza A infection. In some embodiments, the composition, e.g., vaccine, is administered to a subject to elicit an immune response from the subject. In some embodiments, the composition, e.g., vaccine, is administered to a subject to elicit an immune response from the subject prior to a pathogen infection, during a pathogen infection, or as a prophylactic measure against a pathogen infection.

[0452] A complication from an influenza infection, e.g., an influenza A infection, can be, e.g., pneumonia, bronchitis, sinus infection, or an ear infection. In some cases, an influenza infection, e.g., an influenza A infection, can make a chronic health problem worse, e.g., a person with asthma may experience an asthma attack while the person has an influenza infection, or a person with chronic congestive heart failure can experience worsening of this condition while the person has an influenza infection. In some embodiments, a therapeutically effective amount of a composition, e.g., a vaccine described herein, is administered to a subject with a complication from an influenza infection.
In some embodiments, a therapeutically effective amount of a composition, e.g., a vaccine described herein, is administered to a subject with an influenza infection (e.g., an influenza A infection) and one or more chronic health problems.

[0453] In some embodiments, a composition provided herein, e.g., a vaccine, or a kit described herein is stored at between 2°C and 8°C. In some embodiments, the composition, e.g., vaccine, is stored at room temperature. In some embodiments, the composition, e.g., vaccine, is not stored frozen. In some embodiments, the composition, e.g., vaccine, is stored at a temperature such as -20°C or -80°C. In some embodiments, the composition, e.g., vaccine, is stored away from sunlight.

[0454] 7. KITS

[0455] Kits and articles of manufacture are also provided herein for use with one or more methods described herein. The kits can contain one or more peptides with a sequence described herein, such as one or more peptides comprising, consisting essentially of, or consisting of a sequence selected from the group consisting of SEQ ID NOs: 1-68, or a portion of at least 8 amino acids thereof, or peptides comprising, consisting essentially of, or consisting of a sequence with at least 40%, 50%, 60%, 70%, 80%, 90%, or 95% sequence homology to a sequence selected from the group consisting of SEQ ID NOs: 1-68. The kits can contain one or more nucleic acid molecules that encode one or more of the peptides described herein, antibodies that recognize one or more of the amino acid sequences described herein, or APC-based cells activated with one or more of the peptides described herein. The kits can further contain adjuvants, reagents, and/or buffers desired for the makeup and delivery of the vaccines.

[0456] The kits can also include a carrier, package, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements, such as a peptide and adjuvant, to be used in a method described herein. Suitable containers include, for example, bottles, vials, syringes, and test tubes. The containers can be formed from a variety of materials such as glass or plastic.

[0457] The articles of manufacture provided herein contain packaging materials. Examples of pharmaceutical packaging materials include blister packs, bottles, tubes, bags, containers, and any packaging material suitable for a selected formulation and intended mode of administration and treatment.

[0458] The kits can include labels listing contents and/or instructions for use, and package inserts with instructions for use. A set of instructions will also typically be included.

[0459] 8. EXAMPLES

[0460] These examples are provided for illustrative purposes only and not to limit the scope of the claims provided herein.

[0461] Peptide sequences derived from influenza A virus using one or more of the methods described herein are listed in Table 1 and Table 2. Raw sequencing data derived from a method described herein for several of the sequences listed in Table 1 and Table 2 can be accessed under the NIH Short Read Archive under the accession number: SRP033450.

[0462] Table 1

SEQ ID NO: 14 NESADMSIGVTVI

SEQ ID NO: 15 NESADMGIGVTVI

SEQ ID NO: 16 LGPATAQMALQLFIK

SEQ ID NO: 17 LCPATAQMALQLFIK

SEQ ID NO: 18 NEKKAKLAN

SEQ ID NO: 19 SPGMMMGMFN

SEQ ID NO: 20 SPGVMMGMFN

SEQ ID NO: 21 EMATKADY

PA SEQ ID NO: 22 EMATTADY

SEQ ID NO: 23 YHANNSTDTVDTILEKNV

HA SEQ ID NO: 24 YHSNNSTDTVDTILEKNV

SEQ ID NO: 25 YHANNSTDTVDTILEQNV

SEQ ID NO: 26 RGINDRNFW

SEQ ID NO: 27 RWINDRNFW

SEQ ID NO: 28 ELRSRYWA

SEQ ID NO: 29 ELRSRHWA

SEQ ID NO: 30 SFQGRGVFEL

SEQ ID NO: 31 SFQGRGVFEF

NP SEQ ID NO: 32 LSTRGVQI

SEQ ID NO: 33 WHSNLNDATYQRTRALV

SEQ ID NO: 34 WHSNLNDSTYQRTRALV

SEQ ID NO: 35 WHSNLNDATYQRTRSLV

SEQ ID NO: 36 WHSNLNDTTYQRTRALV

SEQ ID NO: 37 WHSNLNDTTYQRTRSLV

SEQ ID NO: 38 AYERMCNILKGKFQT

NA SEQ ID NO: 39 FVIREPFI

SEQ ID NO: 40 FSYKYGNGVW

SEQ ID NO: 41 FSFKYGNGVW

SEQ ID NO: 42 CMRPCFWVELI

SEQ ID NO: 43 CIRPCFWVELI

SEQ ID NO: 44 CMRPFFWVELI

SEQ ID NO: 45 GPDDGAVAVLKY

SEQ ID NO: 46 GPDNGAVAVLKY

SEQ ID NO: 47 GEAPSPYNSRFESVAW

SEQ ID NO: 48 GKAPSPYNSRFESVAW

SEQ ID NO: 49 GEVPSPYNSRFESVAW

SEQ ID NO: 50 GEAPSPYNSKFESVAW

SEQ ID NO: 51 KNTDLEVLMEWLKTRPILSPL

Segment 7 SEQ ID NO: 52 KSTDLEVLMEWLKTRPILSPL

SEQ ID NO: 53 KNTDLEALMEWLKTRPILSPL

Table 2

Proteins Sequences
PB2 SEQ ID NO: 54 GEKANVLIGQGDVVLVMKRK
PB1 SEQ ID NO: 55 YSHGTGTGY
SEQ ID NO: 56 SPGMMMGMF
SEQ ID NO: 57 ERGKLKRRAIATPGMQ
NA SEQ ID NO: 58 SSLCPIRGWAIHSK
SEQ ID NO: 59 GEAPSPYNSRF
SEQ ID NO: 60 VGEAPSPYNSRFESVAW
SEQ ID NO: 61 GPDDGAVAVL
SEQ ID NO: 62 GAVAVLKY
NP SEQ ID NO: 63 WHSNLNDATY
SEQ ID NO: 64 ATYQRTRAL
SEQ ID NO: 65 ATYQRTRALV
SEQ ID NO: 66 YERMCNIL

[0464] The peptide sequences in Tables 1 and 2 were selected based in part on relative fitness (RF) indices as calculated according to methods described herein. In some embodiments, a peptide according to the present invention comprises, consists essentially of, or consists of a sequence selected from the group consisting of SEQ ID NOs: 1-68.

[0465] As used herein, a peptide that "comprises" a sequence according to a specified sequence or formula can be a peptide that can include additional amino acid residues, amino acid isomers, and/or amino acid analogs at its N-terminus, C-terminus, or both. The additional residues may or may not change its activity or function, e.g., increase or decrease the activity of the peptide as compared to the activity of a peptide consisting solely of the specified sequence or formula. As used herein, a peptide that "consists essentially of" a specified sequence or formula can mean that the peptide can include additional amino acid residues, amino acid isomers, and/or amino acid analogs at its N-terminus, C-terminus, or both, so long as the additional residues do not materially change its activity or function, e.g., increase or decrease the activity of the peptide as compared to the activity of a peptide consisting solely of the specified sequence or formula. As used herein, a peptide that "consists of" a specified sequence or formula can mean that the peptide does not include additional amino acid residues, amino acid isomers, and/or amino acid analogs at both its N-terminus and C-terminus.
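By way of illustration only, the "comprises" and "consists of" definitions in paragraph [0465] can be sketched in Python as simple string tests. The function names are hypothetical and do not appear in this application. The sketch ignores amino acid analogs and isomers, and it does not model "consists essentially of", because whether added residues materially change activity cannot be determined from the sequence alone.

```python
def consists_of(peptide: str, specified: str) -> bool:
    """True if the peptide is exactly the specified sequence, with no extra residues."""
    return peptide == specified


def comprises(peptide: str, specified: str) -> bool:
    """True if the specified sequence occurs in the peptide, allowing extra
    residues at the N-terminus, the C-terminus, or both."""
    return specified in peptide


SEQ_ID_66 = "YERMCNIL"         # listed in Table 2
SEQ_ID_38 = "AYERMCNILKGKFQT"  # listed in Table 1

# SEQ ID NO: 38 contains SEQ ID NO: 66 flanked by additional residues,
# so under this simplified reading it "comprises" but does not "consist of" it.
print(comprises(SEQ_ID_38, SEQ_ID_66))
print(consists_of(SEQ_ID_38, SEQ_ID_66))
```

Under these assumptions, a peptide that consists of a sequence also comprises it, which matches the open-ended sense of "comprises" given above.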
[0466] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB2-1 (SEQ ID NO: 93): N1-L2-A3-K4-G5-E6-K7-A8-N9-V10-L11-I12-G13-Q14-G15-D16-V17-V18-L19-V20-M21-K22-R23-K24 (Formula PB2-1), wherein N1 is selected from the group consisting of S, I, T, Y, H, K, N, D, and analogs and isomers thereof, preferably N1 is S, I, T, H, K, N, or D; L2 is selected from the group consisting of I, P, H, F, V, L, R, and analogs and isomers thereof, preferably L2 is I, F, V, L, or R; A3 is selected from the group consisting of P, V, S, E, G, A, T, and analogs and isomers thereof, preferably A3 is P, V, S, E, G, A or T; K4 is selected from the group consisting of K, R, I, T, Q, N, E, and analogs and isomers thereof, preferably K4 is K, R, I, T, Q, N or E; G5 is selected from the group consisting of R, V, E, G, A, and analogs and isomers thereof, preferably G5 is R, V, E, or A; E6 is selected from the group consisting of Q, D, V, K, G, E, A, and analogs and isomers thereof, preferably E6 is D, K, G, E, or A; K7 is selected from the group consisting of N, M, E, K, R, T, Q, and analogs and isomers thereof, preferably K7 is N, E, K, T, or Q; A8 is selected from the group consisting of G, P, D, A, T, V, S, and analogs and isomers thereof, preferably A8 is A or T; N9 is selected from the group consisting of T, H, Y, K, N, S, D, I, and analogs and isomers thereof, preferably N9 is T, K, N or S; V10 is selected from the group consisting of E, V, L, M, G, A, and analogs and isomers thereof, preferably V10 is E, V, L, M, or A; L11 is selected from the group consisting of Q, R, V, I, L, P, and analogs and isomers thereof, preferably L11 is Q, R, V, I, L or P; I12 is selected from the group consisting of V, L, M, S, I, T, N, F, and analogs and isomers thereof, preferably I12 is V, L, M, S, T or N; G13 is selected from the group consisting of G, A, E, V, W, R, and analogs and isomers thereof, preferably G13 is W; Q14 is selected from the group consisting of P, Q, E, H, L, R, K, and analogs and isomers thereof, preferably Q14 is K; G15 is selected from the group consisting of G, A, E, R, V, and analogs and isomers thereof, preferably G15 is E or V; D16 is selected from the group consisting of G, A, N, H, E, D, Y, V, and analogs and isomers thereof, preferably D16 is Y; V17 is selected from the group consisting of A, L, M, G, E, V, I, and analogs and isomers thereof, preferably V17 is I, L or V; V18 is selected from the group consisting of E, A, M, L, V, G, and analogs and isomers thereof, preferably V18 is A, M, L or G; L19 is selected from the group consisting of S, M, V, F, L, W, and analogs and isomers thereof; V20 is selected from the group consisting of G, A, L, E, V, I, and analogs and isomers thereof, preferably V20 is G, E, V or I; M21 is selected from the group consisting of I, L, R, T, K, V, and analogs and isomers thereof, preferably M21 is I, L, R, T, K or V; K22 is selected from the group consisting of K, N, E, R, Q, T, I, and analogs and isomers thereof; R23 is selected from the group consisting of L, Q, P, W, G, R, and analogs and isomers thereof, preferably R23 is R; and K24 is selected from the group consisting of N, K, T, Q, R, E, I, and analogs and isomers thereof, preferably K24 is R. Sometimes, the peptide comprises at least 50%, 54%, 58%, 62%, 66%, 70%, 75%, 79%, 83%, 87%, 91%, 95%, or 99% sequence identity to NLAKGEKANVLIGQGDVVLVMKRK (SEQ ID NO: 3). Sometimes, the peptide comprises, consists essentially of, or consists of NLAKGEKANVLIGQGDVVLVMKRK (SEQ ID NO: 3) or NLAKGEKANVLIGQGDIVLVMKRK (SEQ ID NO: 4). In some embodiments, the peptide is about, at least, or at most, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 24 to about 50 amino acids in length, about 24 to about 40 amino acids in length, or about 24 to about 30 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB2-1, wherein the portion of the sequence according to Formula PB2-1 is about, or at least, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23 contiguous amino acid positions of the sequence according to Formula PB2-1.

[0467] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB2-2 (SEQ ID NO: 94): A1-Y2-M3-L4-E5-R6-E7-L8-V9-R10-K11 (Formula PB2-2), wherein A1 is selected from the group consisting of A, G, E, and analogs and isomers thereof, preferably A1 is A or E; Y2 is selected from the group consisting of F, H, N, C, D, Y, S, and analogs and isomers thereof, preferably Y2 is F, H, C, D, Y or S; M3 is selected from the group consisting of T, V, R, L, K, I, and analogs and isomers thereof, preferably M3 is T, V, L or I; L4 is selected from the group consisting of F, L, M, V, W, S, and analogs and isomers thereof, preferably L4 is L or M; E5 is selected from the group consisting of Q, V, A, G, D, E, K, and analogs and isomers thereof, preferably E5 is Q, D, E or K; R6 is selected from the group consisting of G, K, I, R, S, T, and analogs and isomers thereof, preferably R6 is R; E7 is selected from the group consisting of A, G, E, D, K, Q, V, and analogs and isomers thereof, preferably E7 is A, G, E, D or K; L8 is selected from the group consisting of V, Q, M, L, P, R, and analogs and isomers thereof, preferably L8 is L; V9 is selected from the group consisting of D, A, V, F, G, L, I, and analogs and isomers thereof, preferably V9 is A, V, F, L or I; R10 is selected from the group consisting of G, H, L, C, S, R, P, and analogs and isomers thereof, preferably R10 is G, H, S, R or P; and K11 is selected from the group consisting of E, I, K, N, Q, R, T, and analogs and isomers thereof, preferably K11 is E, K, N, Q, R or T. Sometimes, the peptide comprises at least 45%, 54%, 63%, 72%, 81%, 90%, or 99% sequence identity to AYMLERELVRK (SEQ ID NO: 1). Sometimes, the peptide comprises, consists essentially of, or consists of AYMLERELVRK (SEQ ID NO: 1).
In some embodiments, the peptide is about, at least, or at most, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 11 to about 50 amino acids in length, about 11 to about 40 amino acids in length, about 11 to about 30 amino acids in length, or about 11 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB2-1, wherein the portion of the sequence according to Formula PB2-1 is about, or at least, 8, 9, or 10 contiguous amino acid positions of the sequence according to Formula PB2-2.

[0468] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB2-3 (SEQ ID NO: 95): I1-L2-T3-D4-S5-Q6-T7-A8-T9-K10 (Formula PB2-3), wherein I1 is selected from the group consisting of L, I, T, V, R, K, M, and analogs and isomers thereof, preferably I1 is L or V; L2 is selected from the group consisting of R, L, H, I, F, P, V, and analogs and isomers thereof; T3 is selected from the group consisting of S, N, A, P, T, I, and analogs and isomers thereof; D4 is selected from the group consisting of Y, H, N, G, D, E, A, V, and analogs and isomers thereof, preferably D4 is Y; S5 is selected from the group consisting of G, R, T, C, I, N, S, and analogs and isomers thereof, preferably S5 is C; Q6 is selected from the group consisting of K, H, Q, P, E, L, R, and analogs and isomers thereof; T7 is selected from the group consisting of S, A, T, R, K, I, P, and analogs and isomers thereof; A8 is selected from the group consisting of T, A, E, G, P, S, V, and analogs and isomers thereof, preferably A8 is A; T9 is selected from the
group consisting of I, P, A, N, S, T, and analogs and isomers thereof; and K10 is selected from the group consisting of E, T, K, N, Q, I, R, and analogs and isomers thereof, preferably K10 is T, I, or R. Sometimes, the peptide comprises at least 50%, 60%, 70%, 80%, 90%, or 99% sequence identity to ILTDSQTATK (SEQ ID NO: 2). Sometimes the peptide comprises, consists essentially of, or consists of ILTDSQTATK (SEQ ID NO: 2). In some embodiments, the peptide is about, at least, or at most, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 10 to about 50 amino acids in length, about 10 to about 40 amino acids in length, about 10 to about 30 amino acids in length, or about 10 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB2-3, wherein the portion of the sequence according to Formula PB2-3 is about, or at least, 8 or 9 contiguous amino acid positions of the sequence according to Formula PB2-3.

[0469] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-4 (SEQ ID NO: 96): Y1-S2-H3-G4-T5-G6-T7-G8-Y9-T10 (Formula PB1-4), wherein Y1 is selected from the group consisting of N, H, D, F, C, Y, S, and analogs and isomers thereof, preferably Y1 is F or Y; S2 is selected from the group consisting of I, N, C, G, S, R, T, and analogs and isomers thereof; H3 is selected from the group consisting of Y, Q, P, R, L, N, H, D, and analogs and isomers thereof, preferably H3 is R or H; G4 is selected from the group consisting of G, A, W, V, R, E, and analogs and isomers thereof, preferably G4 is G or W; T5 is selected from the group consisting of K, I, A, T, S, R, P, and analogs and isomers thereof, preferably T5 is T or S; G6 is selected from the group consisting of R, V, A, G, E, and analogs and isomers thereof, preferably G6 is V or G; T7 is selected from the group consisting of T, P, S, R, A, I, K, and analogs and isomers thereof, preferably T7 is T or A; G8 is selected from the group consisting of A, G, E, R, V, and analogs and isomers thereof, preferably G8 is G or V; Y9 is selected
from the group consisting of Y, D, F, C, N, H, S, and analogs and isomers thereof, preferably Y9 is Y or H; and T10 is selected from the group consisting of P, S, T, I, N, A, and analogs and isomers thereof, preferably T10 is T. Sometimes, the peptide comprises at least 50%, 60%, 70%, 80%, 90%, or 99% sequence identity to YSHGTGTGYT (SEQ ID NO: 5). Sometimes the peptide comprises, consists essentially of, or consists of YSHGTGTGYT (SEQ ID NO: 5) or YSHWTGTGYT (SEQ ID NO: 6). In some embodiments, the peptide is about, at least, or at most, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 10 to about 50 amino acids in length, about 10 to about 40 amino acids in length, about 10 to about 30 amino acids in length, or about 10 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-4, wherein the portion of the sequence according to Formula PB1-4 is about, or at least, 8 or 9 contiguous amino acid positions of the sequence according to Formula PB1-4.

[0470] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-5 (SEQ ID NO: 97): E1-R2-G3-K4-L5-K6-R7-R8-A9-I10-A11-T12-P13-G14-M15-Q16 (Formula PB1-5), wherein E1 is present or absent, and if present E1 is selected from the group consisting of K, D, A, V, Q, G, E, and analogs and isomers thereof, preferably E1 is D or E; R2 is selected from the group consisting of T, K, I, G, S, R, and analogs and isomers thereof, preferably R2 is I or R; G3 is selected from the group consisting of W, V, R, G, E, A, and analogs and isomers thereof, preferably G3 is W, G or E; K4 is selected from the group consisting of E, M, N, K, T, Q, R, and analogs and isomers thereof, preferably K4 is K; L5 is selected from the group consisting of R, L, P, Q, V, I, and analogs and isomers thereof, preferably L5 is L; K6 is selected from the group consisting of T, R, Q, N, K, I, E, and analogs and isomers thereof, preferably K6 is T, R, or K; R7 is selected from the group consisting of L, G, R, P, Q, W, and analogs and isomers thereof, preferably
R7 is R; R8 is selected from the group consisting of K, I, G, S, R, T, and analogs and isomers thereof, preferably R8 is I or R; A9 is selected from the group consisting of V, A, G, E, S, P, T, and analogs and isomers thereof, preferably A9 is A, E or S; I10 is selected from the group consisting of S, T, F, I, M, L, N, V, and analogs and isomers thereof, preferably I10 is I; A11 is selected from the group consisting of T, V, A, P, S, E, G, and analogs and isomers thereof, preferably A11 is A or S; T12 is selected from the group consisting of S, P, N, A, T, I, and analogs and isomers thereof, preferably T12 is S or T; P13 is selected from the group consisting of A, L, S, R, Q, T, P, and analogs and isomers thereof, preferably P13 is Q, T or P; G14 is selected from the group consisting of G, E, A, W, V, R, and analogs and isomers thereof, preferably G14 is G, W or V; M15 is selected from the group consisting of I, K, T, V, L, R, and analogs and isomers thereof, preferably M15 is I or V; and Q16 is selected from the group consisting of R, Q, P, K, H, L, E, and analogs and isomers thereof, preferably Q16 is K. Sometimes, the peptide comprises at least 50%, 56%, 62%, 68%, 75%, 81%, 87%, 93%, or 100% sequence identity to ERGKLKRRAIATPGMQ (SEQ ID NO: 57). Sometimes, the peptide comprises at least 46%, 53%, 60%, 66%, 73%, 80%, 86%, 93%, or 99% sequence identity to RGKLKRRAIATPGMQ (SEQ ID NO: 7). Sometimes the peptide comprises, consists essentially of, or consists of ERGKLKRRAIATPGMQ (SEQ ID NO: 57), RGKLKRRAIATPGMQ (SEQ ID NO: 7), or RGKLKRRAIATPWMQ (SEQ ID NO: 8). In some embodiments, the peptide is about, at least, or at most, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 15 to about 50 amino acids in length, about 15 to about 40 amino acids in length, about 15 to about 30 amino acids in length, or about 15 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-5, wherein the portion of the sequence according to Formula PB1-5 is about, or at least, 8, 9, 10, 11, 12, 13, 14, or 15 contiguous amino acid positions of the sequence according to Formula PB1-5.

[0471] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-6 (SEQ ID NO: 98): N1-E2-K3-K4-A5-K6-L7-A8-N9 (Formula PB1-6), wherein N1 is selected from the group consisting of D, H, I, Y, K, T, S, N, and analogs and isomers thereof; E2 is selected from the group consisting of G, V, A, D, E, K, Q, and analogs and isomers thereof, preferably E2 is G, E, or Q; K3 is selected from the group consisting of R, Q, T, E, K, N, M, and analogs and isomers thereof; K4 is selected from the group consisting of T, Q, R, N, K, E, I, and analogs and isomers thereof, preferably K4 is K or E; A5 is selected from the group consisting of V, T, S, P, G, E, A, and analogs and isomers thereof, preferably A5 is E or A; K6 is selected from the group consisting of R, E, M, Q, K, N, T, and analogs and isomers thereof, preferably K6 is R or K; L7 is selected from the group consisting of S, W, V, F, M, L, and analogs and isomers thereof, preferably L7 is W, F or L; A8 is selected from the group consisting of S, P, V, T, A, G, E, and analogs and isomers thereof, preferably A8 is G or E; and N9 is selected from the group consisting of I, H, K, N, D, Y, S, T, and analogs and isomers thereof, preferably N9 is N.
Sometimes, the peptide comprises at least 44%, 55%, 66%, 77%, 88%, or 99% sequence identity to NEKKAKLAN (SEQ ID NO: 18). Sometimes the peptide comprises, consists essentially of, or consists of NEKKAKLAN (SEQ ID NO: 18). In some embodiments, the peptide is about, at least, or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 9 to about 50 amino acids in length, about 9 to about 40 amino acids in length, about 9 to about 30 amino acids in length, or about 9 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-6, wherein the portion of the sequence according to Formula PB1-6 is about, or at least, 8 contiguous amino acid positions of the sequence according to Formula PB1-6.
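By way of illustration only, a sequence "according to" a positional formula such as Formula PB1-6 can be checked with a short Python sketch. The residue sets below are transcribed from paragraph [0471]. The names `FORMULA_PB1_6` and `matches_formula` are hypothetical, and the sketch assumes standard one-letter residues only, so "analogs and isomers thereof" are not modeled.

```python
# Allowed residues at each position of Formula PB1-6 (N1-E2-K3-K4-A5-K6-L7-A8-N9),
# transcribed from paragraph [0471]; analogs and isomers are not represented.
FORMULA_PB1_6 = [
    set("DHIYKTSN"),  # N1
    set("GVADEKQ"),   # E2
    set("RQTEKNM"),   # K3
    set("TQRNKEI"),   # K4
    set("VTSPGEA"),   # A5
    set("REMQKNT"),   # K6
    set("SWVFML"),    # L7
    set("SPVTAGE"),   # A8
    set("IHKNDYST"),  # N9
]


def matches_formula(peptide: str, formula: list) -> bool:
    """True if the peptide has the formula's length and every residue
    belongs to the allowed set at its position."""
    if len(peptide) != len(formula):
        return False
    return all(residue in allowed for residue, allowed in zip(peptide, formula))


print(matches_formula("NEKKAKLAN", FORMULA_PB1_6))  # SEQ ID NO: 18
print(matches_formula("NEKKAKLAP", FORMULA_PB1_6))  # P is not listed for N9
```

A test of this kind addresses only a peptide that "consists of" the formula; a peptide that "comprises" it could carry additional flanking residues and would need a windowed search.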
[0472] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-7 (SEQ ID NO: 99): S1-P2-G3-M4-M5-M6-G7-M8-F9-N10 (Formula PB1-7), wherein S1 is selected from the group consisting of G, R, T, S, C, N, I, and analogs and isomers thereof, preferably S1 is S; P2 is selected from the group consisting of A, L, H, T, R, S, P, and analogs and isomers thereof, preferably P2 is P; G3 is selected from the group consisting of G, A, E, R, V, and analogs and isomers thereof; M4 is selected from the group consisting of M, T, I, V, L, R, K, and analogs and isomers thereof, preferably M4 is I or V; M5 is selected from the group consisting of L, I, K, V, R, T, and analogs and isomers thereof, preferably M5 is I; M6 is selected from the group consisting of L, I, K, R, V, T, and analogs and isomers thereof, preferably M6 is I; G7 is selected from the group consisting of G, C, S, R, A, V, D, and analogs and isomers thereof; M8 is selected from the group consisting of L, T, R, I, V, K, and analogs and isomers thereof; F9 is selected from the group consisting of S, Y, F, C, I, V, L, and analogs and isomers thereof, preferably F9 is F or L; and N10 is present or absent, and if present N10 is selected from the group consisting of S, T, Y, D, K, I, H, N, and analogs and isomers thereof. Sometimes, the peptide comprises at least 50%, 60%, 70%, 80%, 90%, or 99% sequence identity to SPGMMMGMFN (SEQ ID NO: 19). Sometimes the peptide comprises, consists essentially of, or consists of SPGMMMGMFN (SEQ ID NO: 19), SPGVMMGMFN (SEQ ID NO: 20), or SPGMMMGMF (SEQ ID NO: 56). In some embodiments, the peptide is about, at least, or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 10 to about 50 amino acids in length, about 10 to about 40 amino acids in length, about 10 to about 30 amino acids in length, or about 10 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-7, wherein the portion of the sequence according to Formula PB1-7 is about, or at least, 8 or 9 contiguous amino acid positions of the sequence according to Formula PB1-7.
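By way of illustration only, the "portion" of 8 or 9 contiguous amino acid positions referred to in paragraph [0472] can be enumerated with a sliding window. The sketch below applies the window to one concrete sequence falling under Formula PB1-7, SPGMMMGMFN (SEQ ID NO: 19), rather than to the formula itself; the function name `contiguous_portions` is hypothetical.

```python
def contiguous_portions(sequence: str, min_len: int = 8) -> list:
    """List every contiguous sub-sequence that is at least min_len residues
    long and shorter than the full sequence."""
    portions = []
    for length in range(min_len, len(sequence)):
        for start in range(len(sequence) - length + 1):
            portions.append(sequence[start:start + length])
    return portions


SEQ_ID_19 = "SPGMMMGMFN"

for portion in contiguous_portions(SEQ_ID_19):
    print(len(portion), portion)
```

For this 10-residue sequence the sketch yields three 8-residue portions and two 9-residue portions, one of which, SPGMMMGMF, is the sequence listed as SEQ ID NO: 56.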

[0473] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-8 (SEQ ID NO: 100): G1-L2-Q3-S4-S5-D6-D7-F8-A9-L10-I11 (Formula PB1-8), wherein G1 is selected from the group consisting of G, C, A, R, S, V, D, and analogs and isomers thereof, preferably G1 is G or C; L2 is selected from the group consisting of R, P, V, F, H, L, I, and analogs and isomers thereof, preferably L2 is V, L or I; Q3 is selected from the group consisting of P, E, R, Q, L, K, H, and analogs and isomers thereof; S4 is selected from the group consisting of Y, S, P, T, C, A, F, and analogs and isomers thereof, preferably S4 is Y, S or A; S5 is selected from the group consisting of T, S, C, Y, A, F, P, and analogs and isomers thereof, preferably S5 is S; D6 is selected from the group consisting of N, G, D, E, A, H, V, Y, and analogs and isomers thereof, preferably D6 is Y; D7 is selected from the group consisting of D, A, G, E, H, N, V, Y, and analogs and isomers thereof, preferably D7 is A, G or Y; F8 is selected from the group consisting of I, C, S, L, F, V, Y, and analogs and isomers thereof; A9 is selected from the group consisting of D, G, A, S, T, P, V, and analogs and isomers thereof, preferably A9 is A, S or V; L10 is selected from the group consisting of R, P, V, L, M, Q, and analogs and isomers thereof, preferably L10 is L or M; and I11 is selected from the group consisting of I, F, T, L, V, S, M, N, and analogs and isomers thereof, preferably I11 is L or M. Sometimes, the peptide comprises at least 45%, 54%, 63%, 72%, 81%, 90%, or 99% sequence identity to GLQSSDDFALI (SEQ ID NO: 9). Sometimes the peptide comprises, consists essentially of, or consists of GLQSSDDFALI (SEQ ID NO: 9). In some embodiments, the peptide is about, at least, or at most, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 11 to about 50 amino acids in length, about 11 to about 40 amino acids in length, about 11 to about 30 amino acids in length, or about 11 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-8, wherein the portion of the sequence according to Formula PB1-8 is about, or at least, 8, 9, 10, 11, or 12 contiguous amino acid positions of the sequence according to Formula PB1-8.
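By way of illustration only, the percent identity values recited for GLQSSDDFALI (SEQ ID NO: 9) in paragraph [0473] can be related to whole numbers of matching positions. The Python sketch below assumes an ungapped, position-by-position comparison of equal-length peptides; this application does not specify an alignment method at this point, so that comparison and the function name `percent_identity` are assumptions made for the example.

```python
def percent_identity(peptide: str, reference: str) -> float:
    """Ungapped, position-by-position identity of two equal-length peptides, in percent."""
    if len(peptide) != len(reference):
        raise ValueError("this sketch only compares peptides of equal length")
    matches = sum(a == b for a, b in zip(peptide, reference))
    return 100.0 * matches / len(reference)


REFERENCE = "GLQSSDDFALI"  # SEQ ID NO: 9, 11 residues

# Whole-percent values for 5 through 10 matching positions out of 11,
# truncated rather than rounded.
thresholds = [(100 * k) // len(REFERENCE) for k in range(5, 11)]
print(thresholds)

# A single substitution (G1 to C, a preferred residue for G1) leaves 10 of 11 positions identical.
print(percent_identity("CLQSSDDFALI", REFERENCE))
```

Under this assumption, the recited 45%, 54%, 63%, 72%, 81%, and 90% correspond to 5, 6, 7, 8, 9, and 10 identical positions out of 11.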

[0474] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-9 (SEQ ID NO: 101): K1-L2-L3-G4-I5-N6-M7-S8-K9 (Formula PB1-9) wherein K1 is selected from the group consisting of T, R, Q, N, M, K, E, and analogs and isomers thereof, preferably K1 is R; L2 is selected from the group consisting of R, P, V, Q, I, L, and analogs and isomers thereof, preferably L2 is L; L3 is selected from the group consisting of R, F, H, V, P, L, I, and analogs and isomers thereof, preferably L3 is L or I; G4 is selected from the group consisting of E, A, R, V, G, and analogs and isomers thereof, preferably G4 is E, V or G; I5 is selected from the group consisting of F, L, M, T, S, N, I, V, and analogs and isomers thereof, preferably I5 is L or M; N6 is selected from the group consisting of Y, D, I, T, S, N, K, H, and analogs and isomers thereof, preferably N6 is T or N; M7 is selected from the group consisting of R, V, T, K, I, L, and analogs and isomers thereof, preferably M7 is V or I; S8 is selected from the group consisting of T, R, S, N, I, G, C, and analogs and isomers thereof, preferably S8 is S; and K9 is selected from the group consisting of E, N, M, K, T, R, Q, and analogs and isomers thereof, preferably K9 is K. Sometimes, the peptide comprises at least 44%, 55%, 66%, 77%, 88%, or 99% sequence identity to KLLGINMSK (SEQ ID NO: 10). Sometimes a peptide comprises, consists essentially of, or consists of KLLGINMSK (SEQ ID NO: 10) or KLVGINMSK (SEQ ID NO: 11). In some embodiments, the peptide is about, at least, or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 10 to about 50 amino acids in length, about 10 to about 40 amino acids in length, about 10 to about 30 amino acids in length, or about 10 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-9, wherein the portion of the sequence according to Formula PB1-9 is about, or at least, 8 contiguous amino acid positions of the sequence according to Formula PB1-9.
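
The position-wise selection recited for Formula PB1-9 can be sketched as a simple membership check. The Python below is a minimal illustration only: the residue sets are copied from the lists in paragraph [0474], "analogs and isomers" are not modelled, and the function name `mismatched_positions` is hypothetical. Because the list recited for M7 does not itself name M, the sketch optionally also accepts the residue of a reference sequence at each position; treating the reference residue as permitted is an assumption of this example.

```python
# Residues recited for each position of Formula PB1-9 (natural amino acids only).
PB1_9_ALLOWED = [
    set("TRQNMKE"),   # K1
    set("RPVQIL"),    # L2
    set("RFHVPLI"),   # L3
    set("EARVG"),     # G4
    set("FLMTSNIV"),  # I5
    set("YDITSNKH"),  # N6
    set("RVTKIL"),    # M7
    set("TRSNIGC"),   # S8
    set("ENMKTRQ"),   # K9
]
PB1_9_REFERENCE = "KLLGINMSK"  # SEQ ID NO: 10


def mismatched_positions(peptide, allowed, reference=None):
    """Return the 1-based positions whose residue is outside the recited set.

    If `reference` is given, the reference residue at each position is also
    accepted (an assumption of this sketch, not a statement of the formula).
    """
    if len(peptide) != len(allowed):
        raise ValueError("peptide length must equal the number of formula positions")
    bad = []
    for i, residue in enumerate(peptide):
        permitted = residue in allowed[i]
        if reference is not None and residue == reference[i]:
            permitted = True
        if not permitted:
            bad.append(i + 1)
    return bad


strict = mismatched_positions("KLLGINMSK", PB1_9_ALLOWED)
relaxed = mismatched_positions("KLLGINMSK", PB1_9_ALLOWED, PB1_9_REFERENCE)
variant = mismatched_positions("KLVGINMSK", PB1_9_ALLOWED, PB1_9_REFERENCE)  # SEQ ID NO: 11
print(strict, relaxed, variant)
```

An empty list means every position falls within the recited (or reference) residues; a non-empty list names the positions that would need closer review.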
[0475] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-10 (SEQ ID NO: 102): N1-R2-T3-G4-T5-F6-E7-F8 (Formula PB1-10) wherein N1 is selected from the group consisting of T, S, N, H, K, D, Y, I, and analogs and isomers thereof, preferably N1 is T or N; R2 is selected from the group consisting of T, K, G, I, S, R, and analogs and isomers thereof, preferably R2 is I or R; T3 is selected from the group consisting of A, I, K, P, R, S, T, and analogs and isomers thereof, preferably T3 is S or T; G4 is selected from the group consisting of D, V, R, S, A, G, C, and analogs and isomers thereof, preferably G4 is G or C; T5 is selected from the group consisting of A, T, R, S, K, I, P, and analogs and isomers thereof, preferably T5 is T or K; F6 is selected from the group consisting of I, V, S, Y, F, C, L, and analogs and isomers thereof, preferably F6 is F; E7 is selected from the group consisting of V, K, G, D, E, A, Q, and analogs and isomers thereof, preferably E7 is E; and F8 is selected from the group consisting of Y, V, S, L, I, F, C, and analogs and isomers thereof, preferably F8 is L or F. Sometimes, the peptide comprises at least 50%, 62%, 75%, 87%, or 99% sequence identity to NRTGTFEF (SEQ ID NO: 12). Sometimes the peptide comprises, consists essentially of, or consists of NRTGTFEF (SEQ ID NO: 12) or NKTGTFEF (SEQ ID NO: 13). In some embodiments, the peptide is about, at least, or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 8 to about 50 amino acids in length, about 8 to about 40 amino acids in length, about 8 to about 30 amino acids in length, or about 8 to about 20 amino acids in length.

[0476] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-11 (SEQ ID NO: 103): N1-E2-S3-A4-D5-M6-S7-I8-G9-V10-T11-V12-I13 (Formula PB1-11) wherein N1 is selected from the group consisting of S, T, Y, D, I, H, K, N, and analogs and isomers thereof, preferably N1 is D or N; E2 is selected from the group consisting of Q, K, A, E, D, G, V, and analogs and isomers thereof, preferably E2 is E; S3 is selected from the group consisting of Y, S, A, F, P, T, C, and analogs and isomers thereof, preferably S3 is Y, S or P; A4 is selected from the group consisting of P, E, V, T, S, A, G, and analogs and isomers thereof, preferably A4 is V or A; D5 is selected from the group consisting of E, H, G, D, Y, N, A, V, and analogs and isomers thereof, preferably D5 is G, D or Y; M6 is selected from the group consisting of T, R, K, I, L, V, and analogs and isomers thereof, preferably M6 is I; S7 is selected from the group consisting of N, R, G, C, I, T, S, and

analogs and isomers thereof, preferably S7 is G; I8 is selected from the group consisting of S, V, T, F, I, N, L, M, and analogs and isomers thereof, preferably I8 is V; G9 is selected from the group consisting of E, G, R, V, A, and analogs and isomers thereof, preferably G9 is G or V; V10 is selected from the group consisting of A, G, F, D, V, I, L, and analogs and isomers thereof, preferably V10 is V or I; T11 is selected from the group consisting of S, I, T, P, N, A, and analogs and isomers thereof, preferably T11 is S, T or A; V12 is selected from the group consisting of A, G, F, V, D, I, L, and analogs and isomers thereof, preferably V12 is A or V; and I13 is selected from the group consisting of F, I, T, L, S, M, V, N, and analogs and isomers thereof, preferably I13 is I or L. Sometimes, the peptide comprises at least 45%, 53%, 61%, 69%, 76%, 84%, 92%, or 99% sequence identity to NESADMSIGVTVI (SEQ ID NO: 14). Sometimes the peptide comprises, consists essentially of, or consists of NESADMSIGVTVI (SEQ ID NO: 14) or NESADMGIGVTVI (SEQ ID NO: 15). In some embodiments, the peptide is about, at least, or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75

amino acids in length. In some embodiments, the peptide is about 13 to about 50 amino acids in length, about 13 to about 40 amino acids in length, about 13 to about 30 amino acids in length, or about 13 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a portion of a sequence according to Formula PB1-11, wherein the portion of the sequence according to Formula PB1-11 is about, or at least, 8, 9, 10, 11, or 12 contiguous amino acid positions of the sequence according to Formula PB1-11.

[0477] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PB1-12 (SEQ ID NO: 104): L1-G2-P3-A4-T5-A6-Q7-M8-A9-L10-Q11-L12-F13-I14-K15 (Formula PB1-12) wherein L1 is selected from the group consisting of L, H, F, V, R, P, I, and analogs and isomers thereof, preferably L1 is L or I; G2 is selected from the group consisting of G, S, R, A, C, D, V, and analogs and isomers thereof, preferably G2 is G or C; P3 is selected from the group consisting of P, A, L, T, Q, S, R, and analogs and isomers thereof, preferably P3 is P or T; A4 is selected from the group consisting of P, A, G, T, V, S, E, and analogs and isomers thereof, preferably A4 is A, S or E; T5 is selected from the group consisting of S, P, N, I, A, T, and analogs and isomers thereof, preferably T5 is T; A6 is selected from the group consisting of D, A, T, S, G, V, P, and analogs and isomers thereof, preferably A6 is A or T; Q7 is selected from the group consisting of R, H, P, Q, E, K, L, and analogs and isomers thereof, preferably Q7 is Q or K; M8 is selected from the group consisting of T, V, I, R, L, K, and analogs and isomers thereof, preferably M8 is I or L; A9 is selected from the group consisting of S, T, A, P, V, G, D, and analogs and isomers thereof, preferably A9 is S; L10 is selected from the group consisting of V, L, F, H, R, I, P, and analogs and isomers thereof,
preferably L10 is I; Q11 is selected from the group consisting of K, Q, H, P, R, L, E, and analogs and isomers thereof, preferably Q11 is K or Q; L12 is selected from the group consisting of M, L, V, R, P, Q, and analogs and isomers thereof, preferably L12 is L; F13 is selected from the group consisting of V, C, I, F, L, S, Y, and analogs and isomers thereof, preferably F13 is L; I14 is selected from the group consisting of T, L, N, I, M, F, V, S, and analogs and isomers thereof, preferably I14 is I or V; and K15 is selected from the group consisting of N, E, K, R, T, Q, I, and analogs and isomers thereof, preferably K15 is K. Sometimes, the peptide comprises at least 46%, 53%, 60%, 66%, 73%, 80%, 86%, 93%, or 99% sequence identity to LGPATAQMALQLFIK (SEQ ID NO: 16). Sometimes the peptide comprises, consists essentially of, or consists of LGPATAQMALQLFIK (SEQ ID NO: 16) or LCPATAQMALQLFIK (SEQ ID NO: 17). In some embodiments, the peptide is about, at least, or at most, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 15 to about 50 amino acids in length, about 15 to about 40 amino acids in length, about 15 to about 30 amino acids in length, or about 15 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula PB1-12, wherein the portion of the sequence according to Formula PB1-12 is about, or at least, 8, 9, 10, 11, 12, 13, or 14 contiguous amino acid positions of the sequence according to Formula PB1-12.
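
To make the notion of a contiguous portion concrete, the following Python sketch enumerates every window of 8 to 14 contiguous positions of a 15-residue sequence. It is a minimal illustration applied to the reference sequence LGPATAQMALQLFIK (SEQ ID NO: 16) as one instance of Formula PB1-12; the function name `contiguous_portions` and the choice to enumerate windows of a single concrete sequence are assumptions of this example.

```python
def contiguous_portions(sequence: str, min_len: int, max_len: int) -> list[str]:
    """List every contiguous window of the sequence with length in [min_len, max_len]."""
    portions = []
    for n in range(min_len, max_len + 1):
        # A sequence of length L has L - n + 1 windows of length n.
        for start in range(len(sequence) - n + 1):
            portions.append(sequence[start:start + n])
    return portions


REFERENCE = "LGPATAQMALQLFIK"  # SEQ ID NO: 16, 15 residues
portions = contiguous_portions(REFERENCE, 8, 14)
print(len(portions))
print(portions[0], portions[-1])
```

For a 15-residue sequence the window counts are 8, 7, 6, 5, 4, 3, and 2 for lengths 8 through 14, giving 35 portions in total.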
[0478] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula PA-13 (SEQ ID NO: 105): E1-M2-A3-T4-K5-A6-D7-Y8 (Formula PA-13) wherein E1 is selected from the group consisting of K, D, E, V, Q, G, A, and analogs and isomers thereof; M2 is selected from the group consisting of T, V, R, L, I, K, and analogs and isomers thereof, preferably M2 is R; A3 is selected from the group consisting of P, S, T, V, A, D, G, and analogs and isomers thereof; T4 is selected from the group consisting of A, K, I, T, R, S, P, and analogs and isomers thereof, preferably T4 is S; K5 is selected from the group consisting of Q, E, K, M, N, R, T, and analogs and isomers thereof, preferably K5 is N or T; A6 is selected from the group consisting of T, V, P, S, D, G, A, and analogs and isomers thereof; D7 is selected from the group consisting of Y, V, N, H, D, E, G, A, and analogs and isomers thereof, preferably D7 is Y; and Y8 is selected from the group consisting of N, H, F, D, C, Y, S, and analogs and isomers thereof, preferably Y8 is D. Sometimes, the peptide comprises at least 50%, 62%, 75%, 87%, or 99% sequence identity to EMATKADY (SEQ ID NO: 21). Sometimes the peptide comprises, consists essentially of, or consists of EMATKADY (SEQ ID NO: 21) or EMATTADY (SEQ ID NO: 22). In some embodiments, the peptide is about, at least, or at most, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 8 to about 50 amino acids in length, about 8 to about 40 amino acids in length, about 8 to about 30 amino acids in length, or about 8 to about 20 amino acids in length.

[0479] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula HA-14 (SEQ ID NO: 106): Y1-H2-A3-N4-N5-S6-T7-D8-T9-V10-D11-T12-I13-L14-E15-K16-N17-V18 (Formula HA-14) wherein Y1 is selected from the group consisting of Y, S, H, N, C, D, F, and analogs and isomers thereof, preferably Y1 is Y, S or F; H2 is selected from the group consisting of D, L, N, H, Q, P, R, Y, and analogs and isomers thereof, preferably H2 is L, N, H, or Q; A3 is selected from the group consisting of A, G, S, P, V, T, E, and analogs and isomers thereof, preferably A3 is A; N4 is selected from the group consisting of Y, S, T, K, N, D, I, H, and analogs and isomers thereof, preferably N4 is N; N5 is selected from the group consisting of T, S, Y, D, N, H, I, K, and analogs and isomers thereof, preferably N5 is N; S6 is selected from the group consisting of L, A, T, P, S, and analogs and isomers thereof, preferably S6 is S; T7 is selected from the group consisting of S, P, T, A, I, N, and analogs and isomers thereof, preferably T7 is T or N; D8 is selected from the group consisting of V, E, A, D, G, H, N, Y, and analogs and isomers thereof, preferably D8 is E, D or Y; T9 is selected from the group consisting of P, S, T, I, N, A, and analogs and isomers thereof, preferably T9 is S or T; V10 is selected from the group consisting of I, V, D, F, G, A, L, and analogs and isomers thereof, preferably V10 is I or V; D11 is selected from the group consisting of Y, V, N, H, G, E, D, A, and analogs and isomers thereof, preferably D11 is Y or D; T12 is selected from the group consisting of I, K, A, T, P, R, S, and analogs and isomers thereof, preferably T12 is T; I13 is selected from the group consisting of L, M, K, I, V, T, R, and analogs and isomers thereof, preferably I13 is I; L14 is selected from the group consisting of R, P, V, F, H, I, L, and analogs and isomers thereof, preferably L14 is I or L; E15 is selected from the group consisting of A, D, E, G, K, Q, V, and analogs and isomers thereof, preferably E15 is D, E or K; K16 is selected from the group consisting of K, M, N, E, Q, R, T, and analogs and isomers thereof, preferably K16 is K or N; N17 is selected from the group consisting of Y, S, T, K, H, N, D, I, and analogs and isomers thereof, preferably N17 is H or D; and V18 is selected from the group consisting of M, V, L, E, G, A, and analogs and isomers

thereof, preferably V18 is V. Sometimes, the peptide comprises at least 50%, 55%, 61%, 66%, 72%, 77%, 83%, 88%, 94%, or 99% sequence identity to YHANNSTDTVDTILEKNV (SEQ ID NO: 23). Sometimes the peptide comprises, consists essentially of, or consists of YHANNSTDTVDTILEKNV (SEQ ID NO: 23), YHSNNSTDTVDTILEKNV (SEQ ID NO: 24), or YHANNSTDTVDTILEQNV (SEQ ID NO: 25). In some embodiments, the peptide is about, at least, or at most, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 18 to about 50 amino acids in length, about 18 to about 40 amino acids in length, about 18 to about 30 amino acids in length, or about 18 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula HA-14, wherein the portion of the sequence according to Formula HA-14 is about, or at least, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 contiguous amino acid positions of the sequence according to Formula HA-14.

[0480] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NP-15 (SEQ ID NO: 107): W1-H2-S3-N4-L5-N6-D7-A8-T9-Y10-Q11-R12-T13-R14-A15-L16-V17 (Formula NP-15) wherein W1 is selected from the group consisting of S, R, L, C, G, and analogs and isomers thereof, preferably W1 is L; H2 is selected from the group consisting of R, P, Q, Y, D, H, N, L, and analogs and isomers thereof, preferably H2 is H; S3 is selected from the group consisting of Y, S, P, T, C, A, F, and analogs and isomers thereof, preferably S3 is Y or S; N4 is selected from the group consisting of T, I, S, Y, D, K, H, N, and analogs and isomers thereof; L5 is selected from the group consisting of V, S, W, F, M, L, and analogs and isomers thereof, preferably L5 is F or L; N6 is selected from the group consisting of T, S, Y, D, N, H, I, K, and analogs and isomers thereof; D7 is selected from the group consisting of V, Y, A, E, D, G, H, N, and analogs and isomers thereof, preferably D7 is Y or D; A8 is selected from the group consisting of T, V, P, S, E, G, A, and analogs and isomers thereof, preferably A8 is S; T9 is selected from the group consisting of N, I, A, T, S, P, and analogs and isomers thereof, preferably T9 is T; Y10 is selected from the group consisting of F, D, C, N, H, S, Y, and analogs and isomers thereof; Q11 is selected from the group consisting of H, K, L, E, P, Q, R, and analogs and isomers thereof, preferably Q11 is Q; R12 is selected from the group consisting of T, W, S, R, G, M, K, and analogs and isomers thereof; T13 is selected from the group consisting of P, R, S, T, A, I, K, and analogs and isomers thereof, preferably T13 is K; R14 is selected from the group consisting of S, R, T, G, K, I, and analogs and isomers thereof, preferably
R14 is R; A15 is selected from the group consisting of A, G, D, S, P, V, T, and analogs and isomers thereof, preferably A15 is A or S; L16 is selected from the group consisting of R, P, V, I, H, L, F, and analogs and isomers thereof, preferably L16 is I or L; and V17 is selected from the group consisting of V, F, G, D, A, L, I, and analogs and isomers thereof, preferably V17 is V. Sometimes, the peptide comprises at least 47%, 53%, 58%, 64%, 70%, 76%, 82%, 88%, 94%, or 99% sequence identity to WHSNLNDATYQRTRALV (SEQ ID NO: 33). Sometimes the peptide comprises, consists essentially of, or consists of WHSNLNDATYQRTRALV (SEQ ID NO: 33), WHSNLNDSTYQRTRALV (SEQ ID NO: 34), WHSNLNDATYQRTRSLV (SEQ ID NO: 35), WHSNLNDTTYQRTRALV (SEQ ID NO: 36), or WHSNLNDTTYQRTRSLV (SEQ ID NO: 37). In some embodiments, the peptide is about, at least, or at most, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 17 to about 50 amino acids in length, about 17 to about 40 amino acids in length, about 17 to about 30 amino acids in length, or about 17 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NP-15, wherein the portion of the sequence according to Formula NP-15 is about, or at least, 8, 9, 10, 11, 12, 13, 14, 15, or 16 contiguous amino acid positions of the sequence according to Formula NP-15.
[0481] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NP-16 (SEQ ID NO: 108): K1-R2-G3-I4-N5-D6-R7-N8-F9-W10 (Formula NP-16) wherein K1 is present or absent, and if present K1 is selected from the group consisting of K, N, I, E, T, Q, R, and analogs and isomers thereof, preferably K1 is R; R2 is selected from the group consisting of S, H, R, P, G, C, L, and analogs and isomers thereof, preferably R2 is R; G3 is selected from the group consisting of R, V, W, A, E, G, and analogs and isomers thereof, preferably G3 is W or G; I4 is selected from the group consisting of T, V, S, F, M, L, N, I, and analogs and isomers thereof, preferably I4 is T, M or I; N5 is selected from the group consisting of K, I, H, N, D, Y, S, T, and analogs and isomers thereof, preferably N5 is N, S or T; D6 is selected from the group consisting of A, G, Y, H, D, E, N, V, and analogs and isomers thereof, preferably D6 is Y or D; R7 is selected from the group consisting of G, L, W, Q, P, R, and analogs and isomers thereof, preferably R7 is R; N8 is selected from the group consisting of D, S, T, Y, K, H, I, N, and analogs and isomers thereof, preferably N8 is N; F9 is selected from the group consisting of I, V, S, Y, F, C, L, and analogs and isomers thereof, preferably F9 is L; and W10 is selected from the group consisting of L, S, G, C, R, and analogs and isomers thereof. Sometimes, the peptide comprises at least 50%, 60%, 70%, 80%, 90%, or 99% sequence identity to KRGINDRNFW (SEQ ID NO: 68). In some embodiments, the peptide comprises at least 44%, 55%, 66%, 77%, 88%, or 99% sequence identity to RGINDRNFW (SEQ ID NO: 26). Sometimes the peptide comprises, consists essentially of, or consists of KRGINDRNFW (SEQ ID NO: 68), RGINDRNFW (SEQ ID NO: 26), or RWINDRNFW (SEQ ID NO: 27). In some embodiments, the peptide is about, at least,

or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 18 to about 50 amino acids in length, about 18 to about 40 amino acids in length, about 18 to about 30 amino acids in length, or about 18 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NP-16, wherein the portion of the sequence according to Formula NP-16 is about, or at least, 8 or 9 contiguous amino acid positions of the sequence according to Formula NP-16.

[0482] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NP-17 (SEQ ID NO: 109): A1-Y2-E3-R4-M5-C6-N7-I8-L9-K10-G11-K12-F13-Q14-T15 (Formula NP-17) wherein A1 is present or absent, and if present A1 is selected from the group consisting of V, P, A, T, S, G, D, and analogs and isomers thereof, preferably A1 is A; Y2 is selected from the group consisting of H, S, C, D, F, N, Y, and analogs and isomers thereof, preferably Y2 is S; E3 is selected from the group consisting of A, D, G, K, Q, V, E, and analogs and isomers thereof, preferably E3 is E; R4 is selected from the group consisting of G, I, K, T, R, S, and analogs and isomers thereof, preferably R4 is I or K; M5 is selected from the group consisting of R, V, T, K, I, L, and analogs and isomers thereof, preferably M5 is I; C6 is selected from the group consisting of C, G, F, Y, S, R, W, and analogs and isomers thereof, preferably C6 is C; N7 is selected from the group consisting of Y, I, N, D,

S, K, H, T, and analogs and isomers thereof, preferably N7 is K; I8 is selected from the group consisting of I, F, S, N, M, L, T, V, and analogs and isomers thereof, preferably I8 is V; L9 is selected from the group consisting of V, R, P, L, H, F, I, and analogs and isomers thereof, preferably L9 is L or I; K10 is present or absent, and if present, K10 is selected from the group consisting of E, N, K, I, T, R, Q, and analogs and isomers thereof, preferably K10 is K; G11 is present or absent, and if present G11 is selected from the group consisting of G, W, A, E, R, V, and analogs and isomers thereof, preferably G11 is G or V; K12 is present or absent, and if present K12 is selected from the group consisting of I, K, E, T, Q, R, N, and analogs and isomers thereof; F13 is present or absent, and if present F13 is selected from the group consisting of Y, V, S, L, I, C, F, and analogs and isomers thereof, preferably F13 is L; Q14 is present or absent, and if present, Q14 is selected from the group consisting of E, L, K, H, R, P, Q, and analogs and isomers thereof, preferably Q14 is K or Q; and T15 is present or absent, and if present T15 is selected from the group consisting of K, I, A, T, R, S, P, and analogs and isomers thereof, preferably T15 is I, T or S. Sometimes, the peptide comprises at least 46%, 53%, 60%, 66%, 73%, 80%, 86%, 93%, or 99% sequence identity to AYERMCNILKGKEQT (SEQ ID NO: 38). Sometimes the peptide comprises, consists essentially of, or consists of AYERMCNILKGKEQT (SEQ ID NO: 38), YERMCNIL (SEQ ID NO: 66), or AYERMCNILKGKF (SEQ ID NO: 67). In some embodiments, the peptide is about, at least, or at most, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length.

In some embodiments, the peptide is about 15 to about 50 amino acids in length, about 15 to about 40 amino acids in length, about 15 to about 30 amino acids in length, or about 15 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NP-17, wherein the portion of the sequence according to Formula NP-17 is about, or at least, 8, 9, 10, 11, 12, 13, or 14 contiguous amino acid positions of the sequence according to Formula NP-17.

[0483] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NP-18 (SEQ ID NO: 110): E1-L2-R3-S4-R5-Y6-W7-A8 (Formula NP-18) wherein E1 is selected from the group consisting of Q, V, D, E, K, A, G, and analogs and isomers thereof, preferably E1 is Q; L2 is selected from the group consisting of V, Q, P, M, L, R, and analogs and isomers thereof, preferably L2 is M or L; R3 is selected from the group consisting of G, I, K, R, T, S, and analogs and isomers thereof, preferably R3 is R; S4 is selected from the group consisting of G, C, N, I, T, R, S, and analogs and isomers thereof, preferably S4 is S; R5 is selected from the group consisting of T, K, R, S, G, I, and analogs and isomers thereof, preferably R5 is K, R or I; Y6 is selected from the group consisting of H, C, F, D, Y, N, S, and analogs and isomers thereof, preferably Y6 is H or Y; W7 is selected from the group consisting of C, S, G, L, R, and analogs and isomers thereof, preferably W7 is L; and A8 is selected from the group consisting of P, A, D, G, S, T, V, and analogs and isomers thereof, preferably A8 is A. Sometimes, the peptide comprises at least 50%, 62%, 75%, 87%, or 99% sequence identity to ELRSRYWA (SEQ ID NO: 28).
Sometimes the peptide comprises, consists essentially of, or consists of ELRSRYWA (SEQ ID NO: 28) or ELRSRHWA (SEQ ID NO: 29). In some embodiments, the peptide is about, at least, or at most, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 8 to about 50 amino acids in length, about 8 to about 40 amino acids in length, about 8 to about 30 amino acids in length, or about 8 to about 20 amino acids in length.

[0484] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NP-19 (SEQ ID NO: 111): S1-F2-Q3-G4-R5-G6-V7-F8-E9-L10 (Formula NP-19) wherein S1 is selected from the group consisting of A, F, T, Y, C, S, P, and analogs and isomers thereof, preferably S1 is S; F2 is selected from the group consisting of F, I, S, L, V, Y, C, and analogs and isomers thereof; Q3 is selected from the group consisting of L, H, E, R, Q, K, P, and analogs and isomers thereof, preferably Q3 is H or Q; G4 is selected from the group consisting of R, V, A, E, G, W, and analogs and isomers thereof, preferably G4 is R or W; R5 is selected from the group consisting of G, Q, L, W, R, P, and analogs and isomers thereof, preferably R5 is R or P; G6 is selected from the group consisting of A, G, E, R, V, and analogs and isomers thereof, preferably G6 is V; V7 is selected from the group consisting of I, L, A, F, V, G, D, and analogs and isomers thereof, preferably V7 is L or V; F8 is selected from the group consisting of S, V, L, I, C, Y, F, and analogs and isomers thereof, preferably F8 is F; E9 is selected from the group consisting of E, D, K, Q, A, V, G, and analogs and isomers thereof; and L10 is selected from the group consisting of P, I, H, L, F, R, V, and analogs and isomers thereof. Sometimes, the peptide comprises at least 50%, 60%, 70%, 80%, 90%, or 99% sequence identity to SFQGRGVFEL (SEQ ID NO: 30).
Sometimes the peptide comprises, consists essentially of, or consists of SFQGRGVFEL (SEQ ID NO: 30) or SFQGRGVFEF (SEQ ID NO: 31). In some embodiments, the peptide is about, at least, or at most, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 10 to about 50 amino acids in length, about 10 to about 40 amino acids in length, about 10 to about 30 amino acids in length, or about 10 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NP-19, wherein the portion of the sequence according to Formula NP-19 is about, or at least, 8 or 9 contiguous amino acid positions of the sequence according to Formula NP-19.
[0485] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NP-20 (SEQ ID NO: 112): L1-S2-T3-R4-G5-V6-Q7-I8 (Formula NP-20) wherein L1 is selected from the group consisting of V, P, R, L, I, H, F, and analogs and isomers thereof, preferably L1 is L or I; S2 is selected from the group consisting of F, C, Y, T, S, A, P, and analogs and isomers thereof, preferably S2 is C, Y, T, S or A; T3 is selected from the group consisting of A, N, T, S, P, I, and analogs and isomers thereof, preferably T3 is T or S; R4 is selected from the group consisting of I, K, G, S, R, T, and analogs and isomers thereof; G5 is selected from the group consisting of R, A, G, E, V, and analogs and isomers thereof, preferably G5 is V; V6 is selected from the group consisting of F, V, D, G, I, L, A, and analogs and isomers thereof, preferably V6 is F, V or A; Q7 is selected from the group consisting of Q, L, E, K, H, R, P, and analogs and isomers thereof, preferably Q7 is Q, K or P; and I8 is selected from the group consisting of S, T, V, I, L, M, N, F, and analogs and isomers thereof, preferably I8 is V, I or L. Sometimes, the peptide comprises at least 50%, 62%, 75%, 87%, or 99% sequence identity to LSTRGVQI (SEQ ID NO: 32). Sometimes the peptide comprises, consists essentially of, or consists of LSTRGVQI (SEQ ID NO: 32). In some embodiments, the peptide is about, at least, or at most, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 8 to about 50 amino acids in length, about 8 to about 40 amino acids in length, about 8 to about 30 amino acids in length, or about 8 to about 20 amino acids in length.
[0486] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NA-21 (SEQ ID NO: 113): G1-P2-D3-D4-G5-A6-V7-A8-V9-L10-K11-Y12 (Formula NA-21) wherein G1 is present or absent, and if present, G1 is selected from the group consisting of C, A, G, D, R, S, V, and analogs and isomers thereof, preferably G1 is C or G; P2 is present or absent, and if present P2 is selected from the group consisting of L, A, T, P, Q, R, S, and analogs and isomers thereof, preferably P2 is T, P or Q; D3 is present or absent, and if present D3 is selected from the group consisting of Y, V, N, H, G, E, D, A, and analogs and isomers thereof; D4 is present or absent, and if present D4 is selected from the group consisting of N, H, G, D, E, A, Y, V, and analogs and isomers thereof, preferably D4 is D or Y; G5 is selected from the group consisting of V, R, G, E, A, and analogs and isomers thereof, preferably G5 is V or G; A6 is selected from the group consisting of P, S, T, V, A, E, G, and analogs and isomers thereof, preferably A6 is A; V7 is selected from the group consisting of V, M, L, A, E, G, and analogs and isomers thereof; A8 is selected from the group consisting of S, P, V, T, A, G, D, and analogs and isomers thereof; V9 is selected from the group consisting of I, L, A, G, E, V, and analogs and isomers thereof; L10 is selected from the group consisting of F, I, L, S, V, and analogs and isomers thereof, preferably L10 is I, L or V; K11 is selected from the group consisting of R, Q, T, K, I, N, E, and analogs and isomers thereof; and Y12 is selected from the group consisting of C, D, F, H, N, S, Y, and analogs and isomers thereof, preferably Y12 is Y. Sometimes, the peptide comprises at least 50%, 58%, 66%, 75%, 83%, 91%, or 99% sequence identity to GPDDGAVAVLKY (SEQ ID NO: 45).
Sometimes the peptide comprises, consists essentially of, or consists of GPDDGAVAVLKY (SEQ ID NO: 45), GPDNGAVAVLKY (SEQ ID NO: 46), or GAVAVLKY (SEQ ID NO: 62). In some embodiments, the peptide is about, at least, or at most, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 12 to about 50 amino acids in length, about 12 to about 40 amino acids in length, about 12 to about 30 amino acids in length, or about 12 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NA-21, wherein the portion of the sequence according to Formula NA-21 is about, or at least, 8, 9, or 10 contiguous amino acid positions of the sequence according to Formula NA-21.

[0487] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NA-22 (SEQ ID NO: 114): F1-V2-I3-R4-E5-P6-F7-I8 (Formula NA-22) wherein F1 is selected from the group consisting of L, I, F, C, Y, V, S, and analogs and isomers thereof, preferably F1 is F; V2 is selected from the group consisting of D, F, G, A, L, I, V, and analogs and isomers thereof, preferably V2 is F or V; I3 is selected from the group consisting of T, V, R, L, M, I, K, and analogs and isomers thereof, preferably I3 is T or R; R4 is selected from the group consisting of R, T, G, K, I, S, and analogs and isomers thereof, preferably R4 is I; E5 is selected from the group consisting of A, G, D, E, K, Q, V, and analogs and isomers thereof, preferably E5 is D or E; P6 is selected from the group consisting of H, L, A, R, S, P, T, and analogs and isomers thereof, preferably P6 is P or T; F7 is selected from the group consisting of Y, V, S, L, I, F, C, and analogs and isomers thereof, preferably F7 is S, L or F; and I8 is selected from the group consisting of V, N, S, T, F, I, M, L, and analogs and isomers thereof, preferably I8 is V or I. Sometimes, the peptide comprises at least 50%, 62%, 75%, 87%, or 99% sequence identity to FVIREPFI (SEQ ID NO: 39). Sometimes the peptide comprises, consists essentially of, or consists of FVIREPFI (SEQ ID NO: 39). In some embodiments, the peptide is about, at least, or at most, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 8 to about 50 amino acids in length, about 8 to about 40 amino acids in length, about 8 to about 30 amino acids in length, or about 8 to about 20 amino acids in length.
[0488] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NA-23 (SEQ ID NO: 115): G1-E2-A3-P4-S5-P6-Y7-N8-S9-R10-F11-E12-S13-V14-A15-W16 (Formula NA-23) wherein G1 is selected from the group consisting of V, S, R, D, G, A, C, and analogs and isomers thereof, preferably G1 is G; E2 is selected from the group consisting of V, Q, K, D, E, G, A, and analogs and isomers thereof, preferably E2 is K or E; A3 is selected from the group consisting of G, D, A, V, T, S, P, and analogs and isomers thereof, preferably A3 is D; P4 is selected from the group consisting of T, P, Q, R, S, L, A, and analogs and isomers thereof, preferably P4 is T or P; S5 is selected from the group consisting of Y, S, P, T, C, A, F, and analogs and isomers thereof, preferably S5 is S; P6 is selected from the group consisting of T, R, S, Q, L, P, A, and analogs and isomers thereof, preferably P6 is T, Q or P; Y7 is selected from the group consisting of H, N, C, F, D, Y, S, and analogs and isomers thereof, preferably Y7 is C; N8 is selected from the group consisting of N, K, H, I, D, Y, T, S, and analogs and isomers thereof; S9 is selected from the group consisting of T, S, P, A, L, and analogs and isomers thereof, preferably S9 is S; R10 is selected from the group consisting of G, K, M, R, S, T, W, and analogs and isomers thereof, preferably R10 is M; F11 is selected from the group consisting of C, F, I, L, S, V, Y, and analogs and isomers thereof, preferably F11 is V; E12 is present or absent, and if present E12 is selected from the group consisting of Q, V, K, A, E, D, G, and analogs and isomers thereof, preferably E12 is E; S13 is present or absent, and if present S13 is selected from the group consisting of L, A, S, P, W, T, and analogs and isomers thereof; V14 is present or absent, and if present V14 is selected from the group consisting of V, A, D, G, F, I, L, and analogs and isomers thereof, preferably V14 is V; A15 is present or absent, and if present A15 is selected from the group consisting of S, P, V, T, A, G, D, and analogs and isomers thereof; and W16 is present or absent, and if present W16 is selected from the group consisting of C, G, L, R, S, and analogs and isomers thereof, preferably W16 is L. Sometimes, the peptide comprises at least 50%, 56%, 62%, 68%, 75%, 81%, 87%, 93%, or 99% sequence identity to GEAPSPYNSRFESVAW (SEQ ID NO: 47). Sometimes the peptide comprises, consists essentially of, or consists of GEAPSPYNSRFESVAW (SEQ ID NO: 47), GKAPSPYNSRFESVAW (SEQ ID NO: 48), GEVPSPYNSRFESVAW (SEQ ID NO: 49), GEAPSPYNSRFESVAW (SEQ ID NO: 50), GEAPSPYNSRF (SEQ ID NO: 59), or VGEAPSPYNSRFESVAW (SEQ ID NO: 60). In some embodiments, the peptide is about, at least, or at most, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 16 to about 50 amino acids in length, about 16 to about 40 amino acids in length, about 16 to about 30 amino acids in length, or about 16 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NA-23, wherein the portion of the sequence according to Formula NA-23 is about, or at least, 8, 9, 10, 11, 12, 13, 14, or 15 contiguous amino acid positions of the sequence according to Formula NA-23.

[0489] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NA-24 (SEQ ID NO: 116): F1-S2-Y3-K4-Y5-G6-N7-G8-V9-W10 (Formula NA-24) wherein F1 is selected from the group consisting of F, L, I, C, Y, V, S, and analogs and isomers thereof; S2 is selected from the group consisting of T, P, S, L, A, and analogs and isomers thereof; Y3 is selected from the group consisting of H, N, C, D, F, Y, S, and analogs and isomers thereof, preferably Y3 is D; K4 is selected from the group consisting of M, N, K, E, T, R, Q, and analogs and isomers thereof, preferably K4 is K or R; Y5 is selected from the group consisting of S, Y, F, N, H, D, C, and analogs and isomers thereof; G6 is selected from the group consisting of G, R, S, D, A, C, V, and analogs and isomers thereof, preferably G6 is D; N7 is selected from the group consisting of T, S, Y, D, N, K, H, I, and analogs and isomers thereof, preferably N7 is D; G8 is selected from the group consisting of C, A, G, D, S, R, V, and analogs and isomers thereof, preferably G8 is C or G; V9 is selected from the group consisting of D, G, V, I, L, A, F, and analogs and isomers thereof, preferably V9 is G or F; and W10 is selected from the group consisting of L, S, R, G, C, and analogs and isomers thereof, preferably W10 is L. Sometimes, the peptide comprises at least 50%, 60%, 70%, 80%, 90%, or 99% sequence identity to FSYKYGNGVW (SEQ ID NO: 40). Sometimes the peptide comprises, consists essentially of, or consists of FSYKYGNGVW (SEQ ID NO: 40) or FSFKYGNGVW (SEQ ID NO: 41). In some embodiments, the peptide is about, at least, or at most, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length.
In some embodiments, the peptide is about 10 to about 50 amino acids in length, about 10 to about 40 amino acids in length, about 10 to about 30 amino acids in length, or about 10 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NA-24, wherein the portion of the sequence according to Formula NA-24 is about, or at least, 8 or 9 contiguous amino acid positions of the sequence according to Formula NA-24.

[0490] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula NA-25 (SEQ ID NO: 117): C1-M2-R3-P4-C5-F6-W7-V8-E9-L10-I11 (Formula NA-25) wherein C1 is selected from the group consisting of C, G, F, Y, S, R, W, and analogs and isomers thereof; M2 is selected from the group consisting of V, T, L, I, K, R, and analogs and isomers thereof, preferably M2 is L; R3 is selected from the group consisting of W, T, M, K, R, G, S, and analogs and isomers thereof, preferably R3 is R; P4 is selected from the group consisting of L, A, H, T, P, S, R, and analogs and isomers thereof, preferably P4 is H or P; C5 is selected from the group consisting of S, G, F, C, W, R, Y, and analogs and isomers thereof, preferably C5 is F; F6 is selected from the group consisting of V, S, Y, F, C, L, I, and analogs and isomers thereof; W7 is selected from the group consisting of L, S, R, G, C, and analogs and isomers thereof; V8 is selected from the group consisting of D, L, F, A, I, G, V, and analogs and isomers thereof, preferably V8 is V; E9 is selected from the group consisting of E, K, A, G, D, Q, V, and analogs and isomers thereof; L10 is selected from the group consisting of V, S, L, I, F, and analogs and isomers thereof, preferably L10 is L; and I11 is selected from the group consisting of S, T, N, I, L, F, V, M, and analogs and isomers thereof, preferably I11 is V. Sometimes, the peptide comprises at least 45%, 54%, 63%, 72%, 81%, 90%, or 99% sequence identity to CMRPCFWVELI (SEQ ID NO: 42). Sometimes the peptide comprises, consists essentially of, or consists of CMRPCFWVELI (SEQ ID NO: 42), CIRPCFWVELI (SEQ ID NO: 43), or CMRPFFWVELI (SEQ ID NO: 44). In some embodiments, the peptide is about, at least, or at most, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length.
In some embodiments, the peptide is about 11 to about 50 amino acids in length, about 11 to about 40 amino acids in length, about 11 to about 30 amino acids in length, or about 11 to about 20 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula NA-25, wherein the portion of the sequence according to Formula NA-25 is about, or at least, 8, 9, or 10 contiguous amino acid positions of the sequence according to Formula NA-25.

[0491] In some embodiments, a peptide described herein comprises, consists essentially of, or consists of a sequence according to Formula Segment 7-26 (SEQ ID NO: 118): K1-N2-T3-D4-L5-E6-V7-L8-M9-E10-W11-L12-K13-T14-R15-P16-I17-L18-S19-P20-L21 (Formula Segment 7-26) wherein K1 is selected from the group consisting of M, N, K, E, T, Q, R, and analogs and isomers thereof, preferably K1 is M, N, K, T or R; N2 is selected from the group consisting of T, Y, K, I, S, D, H, N, and analogs and isomers thereof, preferably N2 is T, Y, K, S or N; T3 is selected from the group consisting of I, N, A, S, P, T, and analogs and isomers thereof, preferably T3 is I, N, A, S or T; D4 is selected from the group consisting of V, H, N, Y, A, G, E, D, and analogs and isomers thereof, preferably D4 is N, Y, A, E or D; L5 is selected from the group consisting of P, R, V, F, I, L, H, and analogs and isomers thereof, preferably L5 is I or L; E6 is selected from the group consisting of G, E, D, A, K, V, Q, and analogs and isomers thereof, preferably E6 is E or D; V7 is selected from the group consisting of V, L, I, D, G, F, A, and analogs and isomers thereof, preferably V7 is V, I, D, G or A; L8 is selected from the group consisting of F, H, I, L, R, P, V, and analogs and isomers thereof, preferably L8 is I or L; M9 is selected from the group consisting of R, V, T, K, I, L, and analogs and isomers thereof, preferably M9 is R, V, K, I or L; E10 is selected from the group consisting of K, A, G, E, D, Q, V, and analogs and isomers thereof, preferably E10 is E; W11 is selected from the group consisting of R, S, G, C, L, and analogs and isomers thereof; L12 is selected from the group consisting of V, R, P, Q, L, I, and analogs and isomers thereof, preferably L12 is L or I; K13 is selected from the group consisting of T, Q, R, M, N, K, E, and analogs and isomers thereof, preferably K13 is N; T14 is selected from the group consisting of K, T, S, A, I, P, R, and analogs and isomers thereof, preferably T14 is K, T or P; R15 is selected from the group consisting of S, R, T, I, K, G, and analogs and isomers thereof, preferably R15 is R or I; P16 is
selected from the group consisting of Q, P, S, R, T, L, A, and analogs and isomers thereof, preferably P16 is Q or P; I17 is selected from the group consisting of T, V, S, F, M, L, N, I, and analogs and isomers thereof, preferably I17 is I; L18 is selected from the group consisting of M, L, V, R, Q, P, and analogs and isomers thereof, preferably L18 is L or V; S19 is selected from the group consisting of A, L, P, S, T, and analogs and isomers thereof; P20 is selected from the group consisting of T, P, S, R, A, L, H, and analogs and isomers thereof, preferably P20 is T, P or H; and L21 is selected from the group consisting of V, R, P, Q, L, M, and analogs and isomers thereof, preferably L21 is V, L or M. Sometimes, the peptide comprises at least 47%, 52%, 57%, 61%, 66%, 71%, 76%, 80%, 85%, 90%, 95%, or 99% sequence identity to KNTDLEVLMEWLKTRPILSPL (SEQ ID NO: 51). Sometimes a peptide comprises, consists essentially of, or consists of KNTDLEVLMEWLKTRPILSPL (SEQ ID NO: 51), KSTDLEVLMEWLKTRPILSPL (SEQ ID NO: 52), or KNTDLEALMEWLKTRPILSPL (SEQ ID NO: 53). In some embodiments, the peptide is about, at least, or at most, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or 75 amino acids in length. In some embodiments, the peptide is about 21 to about 50 amino acids in length, about 21 to about 40 amino acids in length, about 21 to about 30 amino acids in length, or about 21 to about 25 amino acids in length. Also provided is a peptide comprising, consisting essentially of, or consisting of a sequence with at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity to a portion of a sequence according to Formula Segment 7-26, wherein the portion of the sequence according to Formula Segment 7-26 is about, or at least, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 contiguous amino acid positions of the sequence according to Formula Segment 7-26.
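By way of non-limiting illustration, the "portion" language above (8 to 20 contiguous amino acid positions of a 21-position formula) can be made concrete by enumerating contiguous windows. The Python sketch below is offered only as an example: the function name `contiguous_portions` is hypothetical, and the windows are taken over the exemplary sequence of SEQ ID NO: 51 rather than over the full set of residues permitted by Formula Segment 7-26.

```python
def contiguous_portions(sequence, min_len=8, max_len=None):
    """Yield (start_position, portion) for each contiguous portion.

    start_position is 1-based. By default max_len is one less than the full
    length, which for a 21-residue sequence gives portions of 8 to 20 positions.
    """
    if max_len is None:
        max_len = len(sequence) - 1
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start + 1, sequence[start:start + length]


REFERENCE = "KNTDLEVLMEWLKTRPILSPL"  # SEQ ID NO: 51 (21 residues)
portions = list(contiguous_portions(REFERENCE))

if __name__ == "__main__":
    print(len(portions), "portions; first:", portions[0], "last:", portions[-1])
```

For the 21-residue example this yields 104 portions, from position 1 (KNTDLEVL) through the 20-residue portion starting at position 2. Each portion could then be used as the comparison target for a percent-identity calculation.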

[0492] In some embodiments, a composition comprises, consists essentially of, or consists of one or more peptides, which may or may not be purified peptides, according to the present invention. As used herein, a composition "comprising" one or more peptides according to the present invention can mean that the composition can contain other compounds, including one or more proteins that are not peptides according to the present invention (e.g., a peptide of the present invention can be a peptide with a sequence falling within the scope of Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118)). As used herein, a composition "consisting essentially of" one or more peptides according to the present invention can mean that the composition can comprise other compounds in addition to the peptides according to the present invention so long as the additional compounds do not materially change the activity or function of the one or more peptides that are contained in the composition. As used herein, a composition "consisting of" one or more peptides according to the present invention can mean that the composition does not contain other proteins in addition to the one or more peptides according to the present invention. Compositions consisting of one or more peptides according to the present invention can comprise ingredients other than proteins, e.g., pharmaceutically acceptable carriers, surfactants, preservatives, etc. In some embodiments, compositions consisting of one or more peptides of the present invention can contain insignificant amounts of contaminants, which can include peptide contaminants, e.g., smaller fragments of the one or more peptides of the present invention, which may result from, for example, the synthesis of the one or more peptides of the present invention, subsequent processing, storage conditions, and/or protein degradation.

[0493] In some embodiments, a composition, e.g., a vaccine, comprises one or more peptides described herein, e.g., one or more peptides comprising, consisting essentially of, or consisting of a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids. In some embodiments, the one or more peptides preferably comprise amino acids with an RF index of about 0.1 or more, 0.2 or more, about 0.3 or more, about 0.4 or more, or about 0.5 or more. In some embodiments, the composition comprises an adjuvant. In some embodiments, the adjuvant is a lipid. In some embodiments, the adjuvant is an aluminum salt. In some embodiments, the lipid is a palmitoyl group. In some embodiments, the one or more peptides are connected to a CD4+ (helper) T cell epitope. In some embodiments, the composition, e.g., vaccine, is administered to a subject. In some embodiments, the composition, e.g., vaccine, is used to treat a subject who has an influenza infection, such as an influenza A virus infection, an influenza B virus infection, or an influenza C virus infection. In some embodiments, the composition, e.g., vaccine, is used in a vaccination method against the infection of influenza A virus, influenza B virus, or influenza C virus. In some embodiments, the composition, e.g., vaccine, offers cross-protection against the different strains associated with the influenza A virus, the influenza B virus, and/or the influenza C virus.
[0494] A composition, e.g., a vaccine, can comprise about 5 to about 25, about 10 to about 25, about 10 to about 50, about 10 to about 100, or about 25 to about 100 peptides described herein, e.g., peptides comprising, consisting essentially of, or consisting of a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids. A composition, e.g., a vaccine, can comprise at least, or at most, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90 or 100 peptides described herein, e.g., peptides comprising, consisting essentially of, or consisting of a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids.

[0495] In some embodiments, provided herein are engineered viruses comprising one or more of the sequences described herein, e.g., sequences according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118).

[0496] In some embodiments, a nucleic acid molecule provided herein encodes a peptide described herein, e.g., a peptide comprising, consisting essentially of, or consisting of a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids.
[0497] In some embodiments, a composition, e.g., a vaccine, comprises one or more nucleic acid molecules encoding one or more peptides described herein, e.g., one or more peptides comprising, consisting essentially of, or consisting of a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids.

[0498] In some embodiments, a protein described herein (e.g., an antibody or antibody fragment) binds a sequence described herein, e.g., a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids.

[0499] In some embodiments, a composition, e.g., a vaccine, can comprise one or more proteins (e.g., antibodies or antibody fragments) that bind a sequence described herein, e.g., a sequence according to Formula PB2-1 (SEQ ID NO: 93), Formula PB2-2 (SEQ ID NO: 94), Formula PB2-3 (SEQ ID NO: 95), Formula PB1-4 (SEQ ID NO: 96), Formula PB1-5 (SEQ ID NO: 97), Formula PB1-6 (SEQ ID NO: 98), Formula PB1-7 (SEQ ID NO: 99), Formula PB1-8 (SEQ ID NO: 100), Formula PB1-9 (SEQ ID NO: 101), Formula PB1-10 (SEQ ID NO: 102), Formula PB1-11 (SEQ ID NO: 103), Formula PB1-12 (SEQ ID NO: 104), Formula PA-13 (SEQ ID NO: 105), Formula HA-14 (SEQ ID NO: 106), Formula NP-15 (SEQ ID NO: 107), Formula NP-16 (SEQ ID NO: 108), Formula NP-17 (SEQ ID NO: 109), Formula NP-18 (SEQ ID NO: 110), Formula NP-19 (SEQ ID NO: 111), Formula NP-20 (SEQ ID NO: 112), Formula NA-21 (SEQ ID NO: 113), Formula NA-22 (SEQ ID NO: 114), Formula NA-23 (SEQ ID NO: 115), Formula NA-24 (SEQ ID NO: 116), Formula NA-25 (SEQ ID NO: 117), or Formula Segment 7-26 (SEQ ID NO: 118), or a portion thereof of at least 8 contiguous amino acids.

[0500] Some of the peptide formulas as set forth herein were derived from Table S1 of Wu et al. Scientific Reports 2014 which is herein incorporated by reference in its entirety. Some of the peptide formulas as set forth herein were derived from the RF indices that were obtained from the experiments set forth in Wu et al. "High-Throughput Functional Annotation of Influenza A Virus Genome at Single-Nucleotide Resolution" bioRxiv, doi: HyperTextTransferProtocol://dx.doi.org/10.1101/005702 (wherein "HyperTextTransferProtocol" is "http") (Wu et al. bioRxiv 2014), EPub May 31, 2014, which is incorporated by reference in its entirety. In some embodiments, a peptide according to the present invention is one having a sequence according to one of the formulas as set forth herein and wherein one or more or all of the amino acid residues are those which have an RF index of about 0.1 or more. In some embodiments, a peptide according to the present invention is one having a sequence according to one of the formulas as set forth herein and wherein one or more or all of the amino acid residues are those which have an RF index of about 0.2 or more. In some embodiments, a peptide according to the present invention is one having a sequence according to one of the formulas as set forth herein and wherein one or more or all of the amino acid residues are those which have an RF index of about 0.3 or more.

[0501] In some embodiments, provided herein are methods for rational drug design, e.g., in silico analysis to identify external drugs and/or conditions that disrupt an activity of a sequence having multiple residues with a high RF index, e.g., residues with an RF index of about 0.1 or more, or about 0.2 or more.

[0502] EXAMPLE 1: INFLUENZA PEPTIDES AND EPITOPES

[0503] In this example, a high-complexity random point mutagenesis library of the complete influenza A virus genome was generated and combined with next-generation sequencing, and the mutant pool was subjected to growth selection in culture. Quantitative changes in mutations at each position yielded a functional profile of about 80% of the genome at single-base resolution, revealing the degree of mutability on both a global and an atomic level. Finer analyses of the polymerase protein PA and the surface glycoprotein HA aided predictions of previously unknown structure-function relationships, residues affecting structural stability, and potential for mutational escape from immunity. Nonviable polymerase mutants tested in a polymerase assay revealed a PB2 mutant deficient in a function downstream of genome replication. This method can be applied to other viral or microbial genomes capable of being manipulated, e.g., mutated.

[0504] Construction of viral mutant libraries

[0505] To study the full genome of influenza A virus at the nucleotide level, in a comprehensive and unbiased manner, a reverse genetics system was used. Infectious virus can be produced by transfection of a producer cell line with plasmids carrying the 8 segments of the viral genome (Fig. 2), allowing for manipulation of the viral genome with simple subcloning procedures. Mutant libraries based on the 8-plasmid system (Hoffmann, E., et al. (2000) PNAS USA 97, 6108-6113) for the A/WSN/1933(H1N1) strain were constructed by introducing random point mutations by error-prone PCR, generating 8 separate mutant libraries. The 8 segments can be obtained from PB2, PB1, PA, HA, NP, NA, M1/M2, or NS1/NEP. PB2 can be a part of the RNA-dependent RNA polymerase complex, which can facilitate "cap-snatching" from host pre-mRNA molecules to initiate transcription, and can be conducive for replication. PB1 can be the RNA-dependent RNA polymerase, which can bind to terminal ends of vRNA and cRNA for initiation of transcription and replication and can catalyze the sequential addition of nucleotides during RNA chain elongation. PA can be essential for viral transcription and replication and can have endonuclease activity. In some instances, PA does not correlate with polymerase activity. HA can bind sialic acid on the cell surface for attachment, and can undergo a conformational change at low pH, exposing the fusion peptide, which can interact with the endosomal membrane, forming a pore through which the viral RNPs can be released into the cytoplasm. NP can coat viral RNA to form the viral ribonucleoprotein (vRNP) complex, which can be critical for the trafficking of vRNPs into the nucleus. NA can be needed for the final release of virus through cleavage of the HA-sialic acid bond, which can anchor virus to the cell membrane. NA can also prevent virus particles from aggregating.
Ml can form intermediate core of virion and tethers NP w/vRNPs, and can drive budding of virus from the cell membrane. M2 can have ion channel activity, and can conduct protons from acidified endosomes into viral particle resulting in pH dependent dissociation of vRNP from the remainder of viral components. NS1 can inhibit cellular antiviral Type 1 Interferon response, and can dependent on binding to dsRNA. NEP/NS2 can be necessary for nuclear export of vRNP through recruitment of cellular export machinery. Complexity and coverage were estimated from Sanger sequencing of several clones/library (Fig. 2). After deep sequencing, each position displayed a mutation, on average,

in about 20 reads, and 70-80% were covered by 10 mutation occurrences or more ( 1 occurrence = 1virus fragment sequenced) (Fig. 3). [0506] Mutant virus was produced by transfection of the 293T-derived subclone C227, which stably expresses a dominant-negative IRF3 to disable the antiviral interferon pathway, with 1 mutant library plasmid + 7 wild type plasmids (thus 8 separate transfections). Supernatant containing virus was harvested and used to infect the human lung carcinoma line A549, for two successive rounds (Fig. 4). Four populations were obtained for high-throughput sequencing: (1) DNA plasmid library, (2) virus from the C227 transfection, (3) virus from A549

infection Round 1, and (4) virus from A549 infection Round 2 (termed Pops 1-4; Fig. 5A). [0507] Since variant frequencies at each position were to be quantified when many mutations were expected to be < 0.2% in frequency, and SOLiD sequencing, like other deep sequencing platforms, can have a finite error rate (typically 0.1-1%), a method to distinguish between true mutations and sequencing errors was devised (Fig. 5B). Briefly, in the penultimate step of sequencing library processing, copy number of adaptor-ligated fragments was quantified, and as the final step subjected a limited number of molecules to a short run of PCR (about 20 cycles). Thus, each starting molecule was represented by about 10 identical copies; in principle, sequencing error was expected to be a variant present in 1 out of 10, while true mutation was a variant in 10 out of 10. A cutoff for true mutation at p-value of < 10 4 was set (Fig. 6). This method allowed recovery of extensive genetic information from the selection experiments, allowing detection of true sequence variation in most of the viral genome.
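The cutoff described above can be illustrated with a short calculation. The following is a minimal sketch, not the original analysis code: it assumes a one-sided exact binomial tail probability, the 1% null-hypothesis sequencing error rate given in the Methods, and the p < 10⁻⁴ cutoff. The function names `binomial_tail` and `is_true_mutation` are hypothetical.

```python
from math import comb

SEQ_ERROR_RATE = 0.01  # null-hypothesis error rate (1%), as stated in the Methods
P_CUTOFF = 1e-4        # cutoff for calling a true mutation


def binomial_tail(k: int, n: int, p: float) -> float:
    """Exact upper-tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_true_mutation(variant_reads: int, cluster_reads: int) -> bool:
    """Assumed decision rule: call a true mutation when the chance of seeing
    this many variant reads from sequencing error alone is below the cutoff."""
    return binomial_tail(variant_reads, cluster_reads, SEQ_ERROR_RATE) < P_CUTOFF


if __name__ == "__main__":
    for k, n in [(10, 10), (1, 10), (9, 28), (2, 28)]:
        print(f"{k}/{n} variant reads -> true mutation: {is_true_mutation(k, n)}")
```

Under these assumptions, a variant seen in 10 of 10 copies passes the cutoff while a variant seen in 1 of 10 does not, in line with the principle stated above.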

[0508] Determination of essentialness at nucleotide level

[0509] To visualize the degree of genetic flexibility of influenza A virus to its environment base-by-base, quantitative categories were defined. A "severely impaired" mutation was defined as zero occurrences of the mutation in Pop4 and at least 10 occurrences in Pop1. "Strongly attenuated" mutations had normalized Pop4/Pop1 ratios > 0 and < 0.05, "moderately attenuated" > 0.05 and < 0.33, "neutral" > 0.33 and < 3, and "enhanced" > 3. On the nucleotide level, the proportion of severely impaired mutants ranged from 4.6% of covered positions (HA and NS1) to 15.8% (M1/M2). Strongly attenuated mutants ranged from 4.5% (HA) to 9.6% (M1/M2) (Fig. 5C). Moderately attenuated mutants ranged from 44.2% (M1/M2) to 71.6% (PB2), while neutrals ranged from 10.1% (PB1) to 36.9% (HA). Enhanced mutants were 0.05-2.2% (Fig. 7, Fig. 8). The corresponding amino acid data are shown in Fig. 9. The number of positions at each Pop4/Pop1 ratio was plotted, showing a steep negative slope (Fig. 5D, median ratio 0.15).

[0510] To assess the reliability of the screening results, a number of single point mutants were reconstructed and their growth properties in isolation were tested. The effects on viral titer mirrored the majority of both severely impaired (Figs. 10A-E; 75% confirmation rate) and neutral mutations (Fig. 10F; 96% confirmation rate). In some cases, infectious units for severely impaired mutants after C227 transfection were below the detectable limit; these were termed "lethal". Transfection of C227 cells with just the mutant plasmid showed that all but one of the lethal mutants were expressed at near-equal levels to the wild type gene (Fig. 11). For other severely impaired mutations, infectious titer became undetectable or barely detectable after one or two rounds of A549 infection (Fig. 10B, Fig. 10C), or after a single short round of infection (Fig. 10D). To better approximate the high-complexity selection conditions, competition assays were performed in which cells were infected with a 1:1 ratio of mutant:wildtype. In some instances, several mutants displayed marked growth impairment under the competition condition (Fig. 10E). In some cases, mutants that went unconfirmed in these assays may reflect testing at a 50% or 100% single-mutant frequency rather than at < 1% as in the high-complexity screen; however, the latter scenario may be more representative of viral quasispecies (a highly diverse population in a single patient) in nature.

[0511] The distributions of each essentialness category by base position along all 8 viral segments are shown in Fig. 12. At the amino acid level, viral protein structures, to the extent that they have been solved, can be color-coded by their essentialness category for PA N-term, PA C-term, HA, NP, NA, M2, and NS1 RNA-binding and effector domains.

[0512] Codon position analysis supported the general validity of the selection-sequencing data: total mutations in the 3rd codon position and silent mutations increased from Pop1 to Pop4, whereas severely impaired mutations, missense mutations, and nonsense mutations decreased from Pop1 to Pop4 (Fig. 13-Fig. 15). For segments 1-6 (excluding 7-8 because of overlapping or alternate reading frames), severely impaired mutations had the highest frequency of non-conservative missense changes, followed by strongly attenuated ones, whereas neutrals had the highest frequency of conservative changes (Fig. 16).
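The category definitions in this section can be expressed as a small decision function. This is a sketch under stated assumptions rather than the original pipeline: occurrence counts are normalized by the number of clusters covering the position (as described in the Methods), positions with fewer than 10 Pop1 occurrences are left unclassified, and values falling exactly on a boundary (0.05, 0.33, 3) are assigned to the higher category, which the text does not specify. The name `classify_position` and its parameters are hypothetical.

```python
from typing import Optional

MIN_POP1_OCCURRENCES = 10  # positions below this Pop1 count are not reported


def classify_position(pop1_occ: int, pop1_clusters: int,
                      pop4_occ: int, pop4_clusters: int) -> Optional[str]:
    """Assign an essentialness category from the normalized Pop4/Pop1 ratio."""
    if pop1_occ < MIN_POP1_OCCURRENCES or pop1_clusters <= 0 or pop4_clusters <= 0:
        return None  # insufficient coverage to classify
    if pop4_occ == 0:
        return "severely impaired"
    # Normalize each occurrence count by the clusters covering the position.
    ratio = (pop4_occ / pop4_clusters) / (pop1_occ / pop1_clusters)
    if ratio < 0.05:
        return "strongly attenuated"
    if ratio < 0.33:
        return "moderately attenuated"
    if ratio < 3:
        return "neutral"
    return "enhanced"


if __name__ == "__main__":
    # Illustrative counts only, not data from the screen.
    print(classify_position(100, 1000, 0, 1000))
    print(classify_position(100, 1000, 20, 1000))
```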

[0513] Analyses merging essentialness and structure

[0514] To evaluate the validity of the selection/sequencing data at the structural level, structural subregions were considered that have been implicated in functions which were expected to be essential for viral replication. To semi-quantify stretches of amino acid essentialness, a scoring system was used: severely impaired = 4, strongly attenuated = 3, moderately attenuated = 2, and neutral = 1. First, the classical bipartite nuclear localization signal (cNLS) on NP (Weber, F., et al. (1998) Virology 250, 9-18), comprising amino acids 198-216, was found to be largely essential, with positions K198, R199, R213, K214, and R216 scoring 2, 4, 4, 3, and 2, respectively (Fig. 17A). The entire region from 198-216 had an average score of 2.56 (strongly attenuated; p = 2.2 × 10⁻³). Second, the major site of the bipartite NLS in PB2 (752-755KRIR) scored 3, 3, 3, and 2, respectively (strongly attenuated; p = 3.06 × 10⁻³) (data not shown). This region on PB2 can bind to importin-α for nuclear translocation (Tarendeau, F. et al. (2007) Nature Struct. Mol. Biol. 14, 229-233); mutation of K752 to I, which was lethal in the selection, can eliminate hydrogen bonds mediating this interaction (Fig. 17C). The dataset also allowed visualization of domains that were largely dispensable to the virus in their structural contexts. For example, NS1 residues 50-53, 55-56, 63, 67-68, and 70 are identical to a portion of the engrailed homeodomain (Bornholdt & Prasad (2006) Nature Struct. Mol. Biol. 13, 559-560), and all these residues scored 1 in the screen (neutral; p = 1 × 10⁻⁶; Fig. 17B). M2 residues 40 and 42-45, which lie on the lower half of the ion channel, where the membrane-spanning helices splay outward (Schnell & Chou (2008) Nature 451, 591-595), all scored 1 as well (neutral; p = 3.24 × 10⁻⁴; Fig. 18).

[0515] The essentialness data was probed for the possibility of uncovering novel structural information, first by analyzing the C-terminus of PA (He, X. et al. (2008) Nature 454, 1123-1126), focusing on severely impaired and strongly attenuated residues. Interaction of PA with PB1 may be involved in RNA polymerase function. R663 lies at the proposed interface of this interaction and, due to its electrical charge, can contribute an intermolecular bond between PA and PB1. Similarly, the side chain of Q591 can also lie within the PA-PB1 interface. A689 can also be situated to contribute to this interaction, consistent with its location facing outward with no intramolecular interaction with other residues within PA (Fig. 19A). G328 can be located in a loop turn between an alpha helix and a beta strand. C693 on its alpha helix points toward two adjacent helices (Fig. 19B), and this structural subdomain can be involved in the stabilization of the PA molecule as a whole. C453 and C415 can face each other in the core of the protein and can form a disulfide bond (Fig. 19C). The comprehensive data analysis thus provides functional interpretation for the structures.

[0516] The N-terminal region (residues 1-256) of PA can be involved in protein stability, endonuclease activity, cap binding, and promoter binding. Mutations in residues 163-178 can inhibit polymerase activity via a reduction in binding to the cRNA promoter. In support of this, it was found that 5 of the 16 residues in this stretch were severely impaired or strongly attenuated (Fig. 19D). A loop-strand stretch adjacent to 163-178, residues 146-160, was highly enriched in strongly attenuated positions (11 out of 15) (Fig. 19E).

[0517] The stalk region of HA can elicit a broadly neutralizing antibody response and can have high sequence conservation among circulating strains. The complex of a broadly neutralizing human antibody (CR6261) and the stem loop of a 1918 H1N1 strain has been solved, and the essentialness data offered insight on the mutability of this region. H18, D19, G20, and W21 on the HA2 chain were relatively invariant: D19 was strongly attenuated and W21 was severely impaired. I18 (WSN has isoleucine at this residue instead of histidine) was moderately attenuated, and G20 was not sufficiently covered. Antibody-interacting residues in the HA stem loop were relatively mutable. H38, V40, and L42 in HA2 were moderately attenuated and N41 was neutral. In HA2, T41 and I45 were moderately attenuated and Q42 was strongly attenuated. S291 and L292 in HA1 were neutral (Fig. 19F).

[0518] Antigenic sites on the globular head elicited by natural infection and seasonal flu vaccines were neutral or moderately attenuated (Fig. 19G). This observation supports a functional basis for the tendency of this domain to rapidly undergo genetic drift, which can adversely affect both natural and vaccine-induced immunity. A coil with 11 consecutive severely impaired or strongly attenuated residues is adjacent to the CR6261 binding site (Fig. 19H) and can be an alternative vaccine epitope with less potential for immune escape via mutation.
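The region-level p-values quoted in this section come from a bootstrap on the median Pop4/Pop1 ratio, as described in the Methods. The following is a rough sketch of one way such a test could be set up, not the original implementation. The assumptions are: random residue sets of the same size are drawn with replacement from all covered residues, the two-sided p-value is taken as twice the smaller tail, and one pseudo-count is added so the estimate is never exactly zero. The function name and parameters are hypothetical, and the default iteration count is far smaller than the N = 1,000,000 used in the Methods.

```python
import random
from statistics import median
from typing import Sequence


def bootstrap_region_p(region_ratios: Sequence[float],
                       all_ratios: Sequence[float],
                       n_iter: int = 10_000,
                       seed: int = 0) -> float:
    """Two-sided bootstrap p-value for a region's median Pop4/Pop1 ratio."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    observed = median(region_ratios)
    k = len(region_ratios)
    at_or_below = 0
    at_or_above = 0
    for _ in range(n_iter):
        # Median of a random residue set of the same size as the region.
        m = median(rng.choices(all_ratios, k=k))
        at_or_below += m <= observed
        at_or_above += m >= observed
    # Add-one correction (an assumption) avoids reporting p = 0.
    p_low = (at_or_below + 1) / (n_iter + 1)
    p_high = (at_or_above + 1) / (n_iter + 1)
    return min(1.0, 2 * min(p_low, p_high))


if __name__ == "__main__":
    # Synthetic background ratios for illustration only.
    background = [i / 1000 for i in range(1000)]
    print(bootstrap_region_p([0.0] * 5, background, n_iter=2000))
```

Taking deviations in both directions into account means that both unusually essential and unusually dispensable regions yield small p-values.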

[0519] Potential vaccine epitopes based on essentialness

[0520] A constraint on influenza vaccine efficacy can be the antigenic variability of the virus, in both its strain diversity (heterogeneity) and its high rate of spontaneous mutation (antigenic drift). One approach to addressing this problem can be to target highly conserved regions. In some embodiments, essentialness data can also be considered, because non-essential residues within a vaccine epitope can mutate and escape host immunity. In some cases, essentialness does not coincide with sequence conservation at the individual amino acid level, despite a strong correlation between average amino acid essentialness of a whole protein and its relative sequence conservation (Fig. 5E). A list of fourteen stretches especially enriched for essential sequences, upon which improved vaccines can be based, was compiled. The stretches ranged in score from 2.63 to 3.44, with five scoring 3.00 or higher (data not shown). All 15 base positions within these fourteen stretches that were tested by reconstruction of individual mutants were validated as lethal (Fig. 20).

[0521] Alignment of sequences in the public flu database showed the extent of correlation between essentialness and conservation within these fourteen stretches. Of the 14 stretches with highly enriched essentialness, 13 had at least 8 contiguous residues (the minimum length for a CTL epitope) that were >90% conserved (Fig. 21). Three representative randomly chosen 10-mer peptides (Fig. 21) were shown for comparison. One peptide from M1 (position 80-89) with about 100% sequence conservation had an essentialness score of 1.90. In some cases, essentialness and conservation do not correlate.

[0522] The 14 stretches were analyzed through an epitope prediction engine, NetMHCpan2.4 (see Hoof, I. et al. (2009) Immunogenet. 61, 1-13). Present within the 14 stretches were seventy-nine 8-, 9-, 10-, and 11-mer peptides that were either strong (8) or weak (71) binders. Generally, "strong" or "weak" binders can comprise most immunogenic epitopes in a pathogen. These covered 10 of the 12 HLA Class I supertype representatives. Fig. 22A shows 11; Fig. 22B shows a list of 13 peptides.
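The conservation criterion applied to the fourteen stretches (at least 8 contiguous residues that are >90% conserved) can be checked with a simple run-length scan. This is a minimal sketch assuming per-residue conservation is available as a fraction between 0 and 1; the function names are hypothetical and the example values are synthetic.

```python
from typing import Sequence

MIN_CTL_EPITOPE_LEN = 8        # minimum CTL epitope length cited in the text
CONSERVATION_THRESHOLD = 0.90  # residues must be >90% conserved


def longest_conserved_run(conservation: Sequence[float],
                          threshold: float = CONSERVATION_THRESHOLD) -> int:
    """Length of the longest run of consecutive residues above the threshold."""
    best = 0
    current = 0
    for value in conservation:
        current = current + 1 if value > threshold else 0
        best = max(best, current)
    return best


def has_ctl_length_run(conservation: Sequence[float]) -> bool:
    """True when a stretch contains a conserved run of epitope length or more."""
    return longest_conserved_run(conservation) >= MIN_CTL_EPITOPE_LEN


if __name__ == "__main__":
    stretch = [0.95] * 8 + [0.50] + [0.99] * 3  # synthetic conservation values
    print(longest_conserved_run(stretch), has_ctl_length_run(stretch))
```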

[0523] Additional viral protein functions

[0524] Nonviable mutants retaining normal activity in canonical functions can be used to identify additional functions. Polymerase mutants (PB2, PA, and NP) that were confirmed as lethal were examined and their polymerase activity was measured via an artificial luciferase replicon assay. Four lethal PB2 mutants were tested: two of the mutants had no production of vLuc (analogous to vRNA) over background, and one was reduced about 5-fold (Fig. 23A). One lethal mutant (PB2-692) was fully functional in the reporter assay (Fig. 24A, Fig. 24B). Two lethal PA mutants tested had no detectable replicative capability in this assay. All four NP lethal mutants tested had normal levels of activity in the assay (Fig. 23C).

[0525] PB2-692 (S226N) was a nonviable mutant with fully effective canonical function (polymerase activity), and it was studied in more detail. In addition to wild type levels of vLuc (corresponding to vRNA), PB2-692 produced wild type (or greater) levels of cLuc and mLuc (corresponding to viral cRNA and mRNA, respectively; Fig. 24B). Luciferase activity itself, which accounts for all three viral RNA species, was slightly higher in PB2-692 than in wild type (Fig. 24C). It was confirmed that viral genomic copies were present in transfected cell supernatant (C227 with all 8 plasmids), even if no infectious virus was apparent. By 48 hours post-transfection, PB2-692 genomic copies (assessed by QPCR for NP) were reduced by about 20-fold below wildtype. This observation can be due to an inability to amplify via multiple rounds of reinfection (Fig. 24D).

[0526] Different phases of the influenza replication cycle were examined. Application of wildtype or PB2-692 "virus" with normalized genomic copies resulted in similar levels of viral RNA in cell lysates at 6, 10, and 24 hours after infection (Fig. 24E), indicating that PB2-692 can fully enter a cell through binding and fusion. Staining of NP, which can regulate viral nucleoprotein nuclear import and export, showed that the nuclear import function of PB2-692 was intact (Fig. 24F). That PB2-692 (NP) copy numbers increased from 6 hpi to 24 hpi similarly to wildtype (6,000-fold and 14,000-fold, respectively, both equal at 24 hpi; Fig. 24E) indicated that its genome replication function was intact, corroborating the results in the artificial replicon assay. Thus PB2 can be implicated in a function downstream of transcription and replication.

[0527] Viral genomic copies (via NP) in the supernatant of PB2-692-infected A549 cells were detected, although they were reduced (Fig. 24G). Quantitation of all 8 viral segment RNAs showed a generalized, relative defect (about 20-60% reduction in each segment). The decrease in all segments can reflect a defect in the viral production cycle downstream of genome replication, which can implicate an unknown function for PB2. The methods described herein can be used to uncover additional functions for viral proteins that have well-known canonical functions. Related to this, the screen also produced, with functional testing of only a few mutants, a mutant specifically deficient in an unknown function.

[0528] Methods

[0529] Cell lines

[0530] C227, a subclone of 293T human embryonic kidney cells, stably expresses a dominant negative IRF-3 (IRF-3[5D]) (Hwang, S. et al. (2009) Cell Host Microbe 5, 166-178), and was grown in DMEM plus 10% fetal bovine serum, penicillin/streptomycin, and non-essential amino acids. A549 human lung carcinoma cells were grown in RPMI plus 10% fetal bovine serum, penicillin/streptomycin, and non-essential amino acids.

[0531] Subcloning

[0532] The WSN (strain A/WSN/33[H1N1]) 8-plasmid reverse genetics system for influenza A virus includes each flu segment inserted between a pol I promoter and a pol I terminator and flanked by the pol II promoter from cytomegalovirus and the polyadenylation signal of bovine growth hormone (vector: pHW2000) (Hoffmann, E. et al. (2000) PNAS USA 97, 6108-6113). This arrangement allows for bidirectional transcription of both cRNA and mRNA, which can lead to generation of vRNA and protein, respectively. The flu insert was PCR-amplified with the error-prone polymerase Mutazyme II (Stratagene). Mutation frequency with Mutazyme II can be titratable, dependent on input template amount and number of PCR cycles. In this example, a desirable frequency was a balance between enough mutations per clone to achieve sufficient coverage by SOLiD4 sequencing, and few enough mutations per clone to minimize intragenic interference among mutations. The desirable mutation frequencies were achieved with plasmid equivalent to about g flu insert for 25-30 cycles. The restriction enzyme sites BsmBI and/or BsaI were added to the PCR primers, and the PCR products were cloned into BsmBI-digested parental vector pHW2000. Both of these enzymes can cut downstream of their recognition site, which can allow for inclusion of the viral initiation and termination sequences. Ligations were carried out with high concentration T4 ligase (Invitrogen). Sanger sequencing of a small number of clones (4-20) was used to determine the average mutation frequency, and in some cases, the Mutazyme reaction was performed again, with template amount and cycle number adjusted to aim for the desired frequency of 2-6 mutations/kb. When mutation frequency was satisfactory, transformations were carried out with electrocompetent MegaX DH10B T1R cells (Invitrogen), and at least 100,000 colonies for each segment library were scraped and directly processed for plasmid DNA purification (Qiagen). The complexity of each library was estimated by counting bacterial colonies. Multiplication by the mutation frequency gave the coverage depth.

[0533] Point mutants were constructed with the QuikChange XL Mutagenesis kit (Stratagene) according to manufacturer's instructions or with KOD polymerase (Millipore) according to standard KOD protocol.

[0534] Transfections, infections, and titering

[0535] C227 cells were transfected with Lipofectamine 2000 (Invitrogen) using 7 wildtype plasmids plus 1 mutant (library) plasmid. Thus, 8 transfections were done in parallel. Media was removed at 16 hours and cells were washed with PBS once. Media was again replaced at 48 hours, and at 72 hours supernatant containing infectious virus was harvested, filtered through a 0.45 µm MCE filter, and stored at -80°C. Titering was done on A549 cells by limiting dilution endpoint assay (giving TCID50, or tissue culture infectious dose), according to the procedure described in Szretter, K. J., Balish, A. L. & Katz, J. M. "Influenza: Propagation, Quantification, and Storage" in Current Protocols in Microbiology (John Wiley & Sons, 2006). Virus from C227 transfections was used to infect A549 at the MOIs reported in Fig. 4, and at 2 hours post-infection virus was removed, followed by three PBS washes. Cells were infected at the lowest feasible MOIs, in order to minimize transcomplementation of nonfunctional or attenuated virus by viruses with normal replication efficiency. Unless otherwise indicated, virus-containing supernatant was harvested at 24 hours post-infection. A second round of infection on A549 was performed following the first round and titering. Thus, 8 samples for Pop3 and 8 samples for Pop4 were generated.

[0536] For each of the rounds of transfection or infection, an absolute amount of infectious virus at least 5x the complexity of each library was used in the subsequent step, in order to maintain the library complexities. Some supernatant was set aside for isolation of viral RNA, some was used for the subsequent infection round, and the remainder was archived.

[0537] Isolation of viral RNA

[0538] Supernatant from transfected or infected cells was processed with the QIAamp viral RNA kit (Qiagen), according to manufacturer's instructions but scaling up to 280 µl or 560 µl supernatant/sample. For the lowest titer segments (PB1 and PA), 5 x 560 µl and 10 x 560 µl supernatant were processed, respectively, onto multiple QIAamp viral RNA columns, along with the same total amount of carrier RNA as with other samples, then concentrated by passing over a single RNeasy mini column (Qiagen). For C227 supernatant, RNA was treated with DNase I (Qiagen) to eliminate plasmid DNA carryover, then cleaned up over an RNeasy mini column. For intracellular virus, total RNA was prepared using RNeasy mini columns according to manufacturer's instructions. Viral RNA was reverse transcribed with Superscript III (Invitrogen) using random hexamers. Copy number of virus was determined by QPCR for NP against a standard range of NP plasmid. Viral copy number at least 5x the complexity of the library was used as input for PCR recovery of the viral gene. Viral genes were recovered by PCR amplification of non-overlapping fragments of about 500-600 bp. Only the mutant segment was amplified for sequencing. For Pop1, viral genes from the DNA plasmid libraries were PCR-amplified with the same primers.

[0539] SOLiD sequencing

[0540] The ABI SOLiD4 platform was used for the high-throughput screening. Library preparation for paired end with barcode was adapted from manufacturer's instructions in order to distinguish between sequencing error and rare variants in the pool. PCR fragments (recovered from reverse transcribed viral RNA) were fragmented further by sonication with a Covaris instrument, end repaired, and ligated to adaptors with barcodes. The sequences of the set of 96 adaptors are published by ABI, and the adaptors were obtained from Integrated DNA Technologies. All gene fragments and adaptors were mixed in a single pool, keeping each of the four Populations separate. After gel size selection, nick translation and 3 PCR cycles were performed, and the number of molecules was quantified by the TaqMan kit associated with the library preparation kit (ABI). A total of 20 million molecules per Pop (each a unique "starting molecule") were subjected to another PCR run of 20 cycles to generate enough material for bead preparation. Bead preparation, emulsion PCR, and the sequencing run were all according to manufacturer's instructions. Each population was deposited on 2 wells of a 4-well slide, with 2 slides in a single run. About 100 million reads per well were obtained, giving about 10 copies per original starting molecule, each sequenced on independent beads. The paired-end run was conducted in three phases: (1) barcode read, (2) reverse read (35 bp), and (3) forward read (50 bp).
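The figure of about 10 copies per starting molecule follows from the numbers in this paragraph. The short calculation below is only a worked check of that arithmetic, using the approximate values stated above; the function and constant names are hypothetical.

```python
STARTING_MOLECULES_PER_POP = 20_000_000  # unique starting molecules per population
WELLS_PER_POP = 2                        # each population occupied 2 wells
READS_PER_WELL = 100_000_000             # approximate reads obtained per well


def copies_per_starting_molecule(molecules: int, wells: int,
                                 reads_per_well: int) -> float:
    """Average number of sequenced copies of each starting molecule."""
    return wells * reads_per_well / molecules


if __name__ == "__main__":
    depth = copies_per_starting_molecule(
        STARTING_MOLECULES_PER_POP, WELLS_PER_POP, READS_PER_WELL)
    print(f"about {depth:.0f} copies per starting molecule")
```

With 2 wells of about 100 million reads each, the roughly 200 million reads spread over 20 million starting molecules give the stated redundancy of about 10.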

[0541] Data processing and analysis

[0542] Sequence data was quality filtered with the accompanying ABI software for SOLiD4. Mapping was performed with bfastl.6.5f (Homer, N., et al. (2009) PLoS ONE 4, e7767), with the following criteria: (1) the two reads (forward and reverse) aligned to the same flu gene with one sense and one antisense; (2) the two reads could not overlap; and (3) the fragment enclosed was greater than 50 bp. Pop1 yielded 249M mappable reads including both forward (50 bp) and reverse (35 bp) reads, Pop2 183M, Pop3 208M, and Pop4 168M. Reverse reads were not of sufficient quality to use for sequence data; only their end-site information was used, for the error-correction method.

[0543] Copies of a single starting molecule were identified by the same start position (depending on the sonication-fragmentation breakpoint), end position (size range of selected fragments was about 150-180 bp), one of 96 barcodes, and orientation with respect to adaptor ligation. The permutations were about 70 million, so that each starting molecule was either unique or, if genuinely different from another identical-looking molecule, still a sufficient fraction of the cluster such that statistical significance could be attained. For example, if 9 or 10 out of 10 reads showed a variant at a given position (or in the case of multiple identical-appearing but different starting molecules, 10 out of 20 or 10 out of 30), this variant was considered a true mutation; if 1 or 2 out of 10 reads showed a variant, this variant was considered to be sequencing error. A limiting factor for detection of rare variants, rather than sequencing error, was error during PCR recovery of the viral gene (1 in 11,000 for KOD (Takagi, M. et al. (1997) Appl. Environ. Microbiol. 63, 4504-4510)).

[0544] All reads which were identical for the four parameters listed above were placed in a "cluster". Reads arising from a single starting molecule (prior to PCR amplification) formed a "group". Due to nonrandom sonication-fragmentation, not all permutations were met, and thus some clusters included more than one group. Because in this case it could not be distinguished which group a read originated from, a binomial exact test (null hypothesis = 1%, which was a conservative estimate of the sequencing error rate) was applied to assign a p-value to each variant in each read, and p < 10⁻⁴ was set as the cutoff for true mutation. For example, 9 reads with the same variant in a cluster consisting of 28 reads would have been significant, whereas 1 or 2 reads with the same variant in this cluster would not have met the p-value threshold. All independent reads within the same group were conflated and counted as one (i.e., one "occurrence", which equals one virus among the about 20 million). Mutation occurrence for each nucleotide was normalized to the number of clusters covering the nucleotide. For each nucleotide, the ratio of the normalized score of Pop4 to the normalized score of Pop1 (Pop4/Pop1 ratio) defined the category assignment (severely impaired, strongly attenuated, moderately attenuated, neutral, and enhanced). Only base or amino acid positions with Pop1 > 10 occurrences are reported.

[0545] Sequence entropy, as a measure of conservation across all deposited Influenza A virus sequences, was calculated as in Strait, B. J. & Dewey, T. G. (1996) Biophys. J. 71, 148-155. Missense mutations were divided into conservative and non-conservative mutations based on the BLOSUM62 matrix: substitutions scoring 0 or greater were considered conservative, and those scoring negative were considered non-conservative.

[0546] For calculation of p-values for regions of interest located on previously solved crystal or NMR structures, bootstrap analysis (N = 1,000,000) was performed, using the median Pop4/Pop1 ratio for the corresponding region as the parameter. The p-value can indicate the probability of having a random equivalent set of residues as extreme as the sample, so deviations in both directions (severely impairing/strongly attenuated or neutral) were considered.

[0547] One out of 4 severely impaired mutants was validated, and this one manifested the phenotype only after a single short round (13 hours) of A549 infection (Fig. 10D), and the codon positions showed the weakest bias toward the 3rd position for overall mutations and the weakest bias against the 3rd position for severely impairing mutations (Fig. 13 and Fig. 14). The mutation frequency in the original plasmid library (PB1 the highest at 12.8 mutations/clone, Fig. 2) can lead to unacceptably high intragenic interference among mutations.

[0548] Additional validation methods can include individual mutants which can be reconstructed with 1 mutation/virus using site-directed mutagenesis. Mutants can be tested individually by transfection into C227, collection of supernatant, and titering by limiting dilution. Validation can be achieved for lethal mutations where infectious virus is below the detectable limit (< 20 TCID50). One or two rounds of infection of A549 cells can yield a validation of near-lethal effect. A single short round (13 hours) of A549 infection can be needed to validate the near-lethal effect.
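The clustering step described in this section (grouping reads that share start position, end position, barcode, and orientation) can be sketched as follows. This is an illustrative outline under assumptions, not the original code: the `Read` record, its field names, and the variant labels such as "A123G" are hypothetical, and each read is assumed to arrive already mapped with its variants called.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    start: int                 # fragment start (sonication breakpoint)
    end: int                   # fragment end, taken from the reverse read
    barcode: int               # one of the 96 adaptor barcodes
    orientation: str           # "+" or "-" with respect to adaptor ligation
    variants: Tuple[str, ...]  # variants called in this read, e.g. ("A123G",)


ClusterKey = Tuple[int, int, int, str]


def cluster_reads(reads: Iterable[Read]) -> Dict[ClusterKey, List[Read]]:
    """Place reads that are identical for all four parameters in one cluster."""
    clusters: Dict[ClusterKey, List[Read]] = defaultdict(list)
    for read in reads:
        clusters[(read.start, read.end, read.barcode, read.orientation)].append(read)
    return dict(clusters)


def variant_support(cluster: List[Read]) -> Counter:
    """Count how many reads in a cluster carry each variant."""
    support: Counter = Counter()
    for read in cluster:
        support.update(set(read.variants))  # count each variant once per read
    return support


if __name__ == "__main__":
    reads = [
        Read(100, 260, 7, "+", ("A123G",)),
        Read(100, 260, 7, "+", ("A123G",)),
        Read(100, 260, 7, "+", ()),
        Read(105, 270, 7, "+", ("C150T",)),
    ]
    for key, members in cluster_reads(reads).items():
        print(key, len(members), dict(variant_support(members)))
```

The per-cluster read count and per-variant support produced here are the two inputs needed by the binomial exact test described in paragraph [0544].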

[0549] Competition assay

[0550] A549 cells were infected with a 1:1 (volume) ratio of wildtype:mutant at wildtype MOI 0.01 from C227 supernatant. 24 hours later, supernatant was harvested, titered, and used to infect A549 for a second round at MOI 0.01. 24 hours later, viral RNA was isolated from supernatant by QIAamp viral RNA kit, reverse transcribed with random hexamers, and PCR amplified with specific primers to generate amplicons of about 200 bp with the base position in question situated near the middle. The 5' primer incorporated a BamHI recognition site and the 3' primer an XhoI site. Amplicons were digested with restriction enzyme and cloned into pcDNA5 digested with BamHI/XhoI. Individual bacterial colonies were miniprepped and Sanger sequenced (n = 10-20) to estimate the ratio of wildtype to mutant.

[0551] Protein structures

[0552] Protein Data Bank IDs for structures presented here are as follows: PA N-term: 3EBJ; PA C-term: 2JDQ; HA: 3LZG; NP: 2IQH; M1: 1EA3; M2: 2RLF; NS1 RBD: 2ZKO; NS1 effector domain: 3EE9; HA with Ab CR6261 contact sites: 3GBN. All structures were colored for the essentialness categories in PyMOL.

[0553] Vaccine stretch scores

[0554] All possible 8-, 12-, and 20-mer peptides from the flu genome were taken and their essentialness scores tabulated: the average scores were 2.02 ± 0.42 (s.d.), 2.01 ± 0.39, and 2.01 ± 0.37, respectively. The fourteen stretches ranged from 2.63 to 3.44, with five 3.00 or higher. Percentiles among all possible relevant-length peptides ranged from 93.1 to 99.6, with ten greater than 96.0, and six greater than 98.0.
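The stretch scores and percentiles above can be illustrated with a sliding-window average over per-residue essentialness scores (severely impaired = 4, strongly attenuated = 3, moderately attenuated = 2, neutral = 1). This is a minimal sketch with assumptions stated: the percentile is taken as the share of windows scoring strictly below the value, which the text does not define, and the function names are hypothetical. The example scores are synthetic except for the NP cNLS positions K198, R199, R213, K214, and R216 (2, 4, 4, 3, 2) quoted earlier.

```python
from bisect import bisect_left
from typing import List, Sequence

# Scoring system from the structural analysis section.
SCORE = {
    "severely impaired": 4,
    "strongly attenuated": 3,
    "moderately attenuated": 2,
    "neutral": 1,
}


def window_means(scores: Sequence[float], width: int) -> List[float]:
    """Average essentialness score of every contiguous window of a given width."""
    return [sum(scores[i:i + width]) / width
            for i in range(len(scores) - width + 1)]


def percentile_of(value: float, population: Sequence[float]) -> float:
    """Percent of population values strictly below the given value (assumed)."""
    ordered = sorted(population)
    return 100.0 * bisect_left(ordered, value) / len(ordered)


if __name__ == "__main__":
    cnls_positions = [2, 4, 4, 3, 2]          # scores quoted for the NP cNLS
    print(window_means(cnls_positions, 5))    # mean over these five positions
    protein = [1, 1, 4, 4, 4, 4, 1, 1]        # synthetic per-residue scores
    means = window_means(protein, 4)
    print(means, percentile_of(max(means), means))
```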

[0555] Peptide-MHC prediction

[0556] NetMHCpan2.8 (Hoof, I. et al. (2009) Immunogenet. 61, 1-13) is a publicly available engine that can give quantitative binding affinities for any peptide and HLA; for HLA alleles which have little or no experimental binding data, affinities can be predicted based on a computational analysis of HLA allele sequence and predicted structure. NetMHCpan2.8 can classify "strong binders" as those either in the top 0.5 percentile for that HLA supertype or with an IC50 of 50 nM or less, and "weak binders" as those in the top 2 percentile for the HLA supertype or with an IC50 of 500 nM or less; immunogenic peptides can fall into either of these two classes. Peptides with a residue that differs between WSN and the consensus, or with one of several other "non-optimal" features, were assigned (Fig. 22). In some cases, peptides that differ between WSN and the consensus can be vaccine epitopes if (a) the consensus or WSN residue is tolerated and (b) MHC still binds the consensus peptide. In some embodiments, the latter is the case.
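The strong/weak binder thresholds described above can be written as a small rule. This sketch only re-applies the stated cutoffs to values that a prediction engine has already produced; it does not call NetMHCpan, and the function name, argument names, and the "non-binder" label are hypothetical. Treating the cutoffs as inclusive is an assumption.

```python
def classify_binder(ic50_nm: float, percentile_rank: float) -> str:
    """Classify a peptide-HLA pair from predicted IC50 (nM) and percentile rank.

    Strong: top 0.5 percentile or IC50 of 50 nM or less.
    Weak:   top 2 percentile or IC50 of 500 nM or less.
    """
    if percentile_rank <= 0.5 or ic50_nm <= 50:
        return "strong"
    if percentile_rank <= 2.0 or ic50_nm <= 500:
        return "weak"
    return "non-binder"


if __name__ == "__main__":
    # Illustrative values only, not predictions from the study.
    print(classify_binder(ic50_nm=35.0, percentile_rank=1.2))
    print(classify_binder(ic50_nm=820.0, percentile_rank=1.5))
    print(classify_binder(ic50_nm=5000.0, percentile_rank=15.0))
```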

[0557] Quantitative RT-PCR

[0558] After reverse transcription, QPCR was performed on a DNA Engine OPTICON®2 system (MJ Research) using SYBR Green (Molecular Probes) with 60°C annealing and extension step of 30 seconds. Melting curves were monitored to ensure specific amplification. Primers were: NP: F: 5'-GACGATGCAACGGCTGGTCTG-3' (SEQ ID NO: 69), R: 5'-ACCATTGTTCCAACTCCTTT-3' (SEQ ID NO: 70); PB2: F: CGGAAAGGATGCTGGCCCTTTAAC (SEQ ID NO: 71), R: TCCGTTTCATTACCAACACCAC (SEQ ID NO: 72); PB1: F: TCCTGGATCCCCAAAAGAAATCG (SEQ ID NO: 73), R: TCCAGATTCGAAATCAATTCGTG (SEQ ID NO: 74); PA: F: AAGTACTGGCAGAACTGCAGGAC (SEQ ID NO: 75), R: AATCCAACTTGCAAGCGACCTC (SEQ ID NO: 76); HA: F: AACATGGGAGGATGAACTATTAC (SEQ ID NO: 77), R: AGGGAGATTGCTGTTTATAGCTC (SEQ ID NO: 78); NA: F: ATGGAGCAAACGGAGTAAAGG (SEQ ID NO: 79), R: TTGAACGAAACTTCCGCTGTACC (SEQ ID NO: 80); Segment 7: F: ACTCATCCTAGCTCCAGTGCTGG (SEQ ID NO: 81), R: TTTCAAACCGTATTTAAAGCGACG (SEQ ID NO: 82); Segment 8: F: AAGTGGCAGGCCCTCTTTGTATC (SEQ ID NO: 83), R: CCAACTGCATTTTTGACATCCTC (SEQ ID NO: 84).

[0559] Polymerase assay and immunofluorescence [0560] One mutant plus the other three wildtype plasmids (PB2, PB1, PA, and NP) were cotransfected into C227 cells with a vLuciferase reporter plasmid that is specifically responsive to this cohort of viral polymerases (Regan, et al. (2006) J. Virol. 80, 252-60). In this mini-replicon assay, the four viral proteins together can drive transcription and replication of the Luc insert (in the absence of virus production), leading to increased vLuc, cLuc, and mLuc (corresponding to vRNA, cRNA, and mRNA, respectively), which can be measured by QPCR. As baseline controls, three plasmids were transfected, leaving out one. 48 hours after transfection, total cellular RNA was processed with the RNeasy mini kit and reverse transcribed with Superscript III using the gene-specific primers: vLuc: 5'-AGCGAAAGCAGG-3' (SEQ ID NO: 85), cLuc: AGTAGAAACAAGG (SEQ ID NO: 86), or mLuc: oligo(dT). Luc cDNA was then quantified by QPCR, with primers: F: CCAGGGATTTCAGTCGATG (SEQ ID NO: 87) and R: AATCTCACGCAGGCAGTTC (SEQ ID NO: 88). The n - 1 controls gave low baseline expression off the reporter cassette, which was exceeded by wildtype and functionally competent mutants but not by some lethal mutants. Firefly luciferase assay (Promega), normalized to Renilla luciferase, was used to quantify the level of the end product from viral transcription and replication (accounting for all three viral RNA species).

[0561] For NP immunofluorescence, Abnova MAB2442 was used at a dilution of 1:100.

[0562] Discussion [0563] In some cases, studies of viral quasispecies or spontaneously arising drug-resistant mutants can have three methodological shortcomings: (1) incomplete coverage of possible mutants, (2) the absence of inquiry into rare mutations and (3) the lack of controlled "before" and "after" samples (in the case of some studies using clinical samples). The methods described herein can address these issues in concert, with a high-coverage, induced mutagenesis; a controlled in vitro selection; and quantification of changes in mutant frequencies by next-generation sequencing. The methods described herein can be used to uncover novel structure-function insights, with the superimposition of a parameter, "essentialness", and can provide a large set of nonviable mutants which can be probed for mutants competent in the canonical function, which can aid the discovery of additional viral protein functions. The methods described herein can produce a comprehensive list of relatively essential stretches that can be used as vaccine epitopes. The methods described herein can be used to study well-studied or poorly studied pathogens. Methods described herein can be used as a high-throughput and generalizable approach that combines exhaustive reverse and forward genetics in a single comprehensive platform. These methods can be applied using another pathogen to generate peptides that can be used as vaccines against the pathogen.

[0564] EXAMPLE II: GENERATION OF LETHAL DOSE CURVE [0565] Generation of lethal dose curve [0566] A lethal dose curve with wildtype (WSN) virus administered intranasally (i.n.) was performed on the transgenic mice with 4 groups of 4 mice each with descending 10-fold dilutions, starting with 1 x 10^6 TCID50/mouse. These mice can be less "healthy" than their wildtype counterparts. All mice either died or lost enough weight to require euthanasia at the first two doses of virus, and one of the four in the third dilution of virus died or lost enough weight to require euthanasia. WSN (H1N1) strain is a Biosafety Level 2 (BSL-2) influenza A virus strain.

[0567] EXAMPLE III: IMMUNOGENICITY OF LIPOPEPTIDES [0568] The following peptides: ATYQRTRAL (SEQ ID NO: 64), YERMCNIL (SEQ ID NO: 66), and GEAPSPYNSRF (SEQ ID NO: 59) were used in Example III. The peptides were lipidated to generate lipopeptides, which can overcome the limited CD8+ (cytotoxic) T cell (CTL) immunogenicity of unmodified peptides. The lipopeptide comprised Pam2Cys (di-palmitoylated cysteine). The peptides were also attached to the CD4+ (helper) T cell epitope (ISQAVHAAHAEINEAGR (SEQ ID NO: 119)). The CD4+ (helper) T cell epitope and unique CTL epitope were connected by a lysine, and the lysine was connected to two serines in tandem, and the terminal serine was connected to Pam2Cys. [0569] Control peptides can also be used in the following example. In some cases, the control peptides are obtained from NP and NA. The control peptides can be selected based on conservation, but not based on invariance. Sometimes, at least one mutation that abolishes binding to the HLA and is relatively non-invariant can be placed within the sequences of each control peptide. [0570] Three peptides were tested for their ability to elicit an immune response. Two mice per peptide were immunized intranasally with 50 nmol lipopeptide, and 14 days later boosted with the same amount of lipopeptide. Seven days following this, mice were sacrificed and spleens and lungs taken. Spleens or lungs were minced to produce single cell suspensions, and CD8+ T cells were purified and stimulated ex vivo overnight with the cognate minimal, unmodified peptide. Standard IFN-γ ELISPOT was used to detect an immune response to the lipopeptide. ATYQRTRAL (SEQ ID NO: 64) gave a response in which 19% of CD8+ T cells in the lung were peptide-specific (Fig. 25). In general, the peptides tested induce a specific response in about 3 to about 20% of CD8+ T cells. 
Additional testing showed that C57BL/6J (wildtype), HLA-B44-transgenic, and HLA-B44-transgenic mice immunized with adjuvanted flu peptides ATYQRTRAL (SEQ ID NO: 64), YERMCNIL (SEQ ID NO: 66), and GEAPSPYNSRF (SEQ ID NO: 59), respectively, exhibited peptide-specific IFNγ-secreting T cells in the lungs (Fig. 26). SEQ ID NO: 66 is referenced as HLA-B44 peptide #1 and SEQ ID NO: 59 is referenced as HLA-B44 peptide #2 in Fig. 26.

[0571] Sometimes, two mouse lines can be used. Each mouse line can be immunized with two of the peptides comprising a sequence selected from SEQ ID NOs: 1-68. A second set of each mouse line can be immunized with two peptides (control) generated using a high conservation criterion alone. [0572] Lack of immunogenicity of other peptides can be due to (a) failure of peptide binding to HLA, (b) absence of precursor T cells with receptors (TCRs) that bind to the peptide-HLA complex, or (c) failure to stimulate the T cells. Although the accuracy of the NetMHCpan engine is reported to be 85-90%, the sample size is small. In some cases, peptide binding to HLA does not translate into a T cell that is available to bind the complex. In some instances, the mouse T cell repertoire is smaller than that in the human.

[0573] Additional experiments can include 5 new peptides (one experimental from NA and one from the viral polymerase PB1, and one control from NA, one from NP, and one from PB1) which meet the same criteria, in two mice per peptide. Further, one HLA-A1-restricted peptide, two HLA-A24-restricted peptides, and two HLA-B44-restricted peptides in HLA-A1-, HLA-A24-, and HLA-B44-transgenic mice will also be tested. For the negative control, a Hepatitis C virus peptide that is immunogenic but is not predicted to protect against flu virus infection will be used. [0574] EXAMPLE IV: DEGREE OF PROTECTION BY THE LIPOPEPTIDE VACCINES AGAINST WILDTYPE WSN FLU VIRUS [0575] Eight groups of 8 mice each will be used: (1) experimental NP peptide; (2) experimental NA (unless PB1 gives a better immune response); (3) experimental NP + experimental NA (combination vaccination); (4) control NP; (5) control NA; (6) control NP + control NA; (7) non-flu virus (but immunogenic) negative control; and (8) unimmunized negative control. Combination samples will be tested. In some cases, a combination of peptides including multiple peptides per major HLA group can guard against mutational escape. Mice will be primed i.n. and boosted i.n. 14 days later. Thirty days post-boost, mice will be challenged i.n. with a lethal dose of wildtype WSN virus (as determined in Example II). Mice will be weighed and behavioral signs (e.g., huddling, inactivity, ruffled fur) observed and recorded daily. If weight loss is 25% or more, the mouse can be euthanized. One result can be that the first six groups are protected, relatively or absolutely, from weight loss and/or clinical symptoms and/or death, and the seventh (non-flu virus) group is afforded no protection when compared to the eighth, unimmunized group. Using a predetermined table for power-sample size analysis, with Cohen's d > 1.50 (which is the minimum acceptable for a preclinical vaccine efficacy experiment) and a power (1 - false-negative rate) > 0.90, the required sample size can be n = 7.

[0576] EXAMPLE V: CORRELATION OF CANDIDATE PEPTIDES BETWEEN IMMUNOGENICITY AND HIGH CONSERVATION [0577] Mutant library virus will be used to "mimic" mutational escape in the large global human population. Seven groups of 12 mice each (based on a power-sample size analysis with Cohen's d > 1.50) will be immunized. In some cases, instead of being challenged with wildtype virus, the mice will be challenged with the two respective mutant libraries or a combination mutant library produced in cell culture. These libraries can be built on the WSN background to have multiple mutations per virus, with which the next-generation sequencing analysis was conducted for the experimental determination of invariance at the single-base level. A flu "mutant library" means a high-complexity population of hundreds of thousands of different mutant viruses, made in a producer cell line from a plasmid mutant library. Since flu virus can be produced with a plasmid-based system and each viral gene can be a separate plasmid, each flu gene can have its own mutant library. The NP mutant library (for groups 1 and 4, and the non-flu virus control), the NA mutant library (for groups 2 and 5), or a double mutant library (groups 3 and 6) will be used. Mathematically, for the combination control sample (3 and 6), approximately 6 double mutant virions that fail to bind to HLA and are non-invariant can be present in the inoculating sample in each mouse (because of this low number, a greater sample size can be utilized in this experiment). For the single peptide controls (4 and 5), about 5000 single mutant viruses that meet these criteria can be present in each mouse's inoculating sample. In some instances, the first 6 groups will initially control the infection such that virtually all virus is eliminated except the "escape" single mutants (in groups 1, 2, 4, 5) and the escape double mutants (in groups 3 and 6). After a bottleneck, the escape mutant(s) can surge, expand, and cause morbidity and/or mortality in groups 4-6 but not in groups 1-3, because the mutants that escape the peptide vaccine in groups 4-6 are viable while those in groups 1-3 are nonviable. However, the single peptide controls (groups 4 and 5) can have an escape mutant at an approximately 1000-fold higher frequency; viral escape and induction of morbidity and mortality in these groups can be compared to groups 1 and 2, which can be less likely to be fully protected for an indefinite time period than the double peptide group 3.

[0578] EXAMPLE VI: CONSTRUCTION OF INFLUENZA B VIRUS MUTANT LIBRARIES [0579] A selection/screen for influenza B virus will be conducted. An 8-plasmid IBV reverse genetics system will be generated using the B/Lee/40 strain. [0580] The virus will be cloned by growing and amplifying it in MDCK cells (a common cell line used for influenza virus propagation). Culture supernatant will be collected and viral RNA isolated using the QIAamp viral RNA kit (Qiagen), and all 8 segments will be cloned by PCR into a bidirectional vector (pHW2000) which can allow for transcription in one direction of one strand of virus (mRNA) and in the opposite direction for the other strand (vRNA, which is the genome present in the virion). The virus will be tested for growth in transfected producer cells and for its infection of target cells. C227 cells will be transfected by Lipofectamine 2000 (Life Technologies), producing virus which then will be used to infect MDCK cells, where it is amplified. By maintaining the ability to both transfect and infect human cells, the selection/screen will involve virus-host cell protein-protein interactions that can be relevant in the human host. The cell line used for infections, A549, a human lung carcinoma line, will be tested for infection by C227-produced IBV. Both of these lines can be effectively transfected/infected. [0581] The mutant library and sequences resulting from progeny virus passage will contain, on average, 1 mutation per clone. This scenario can eliminate the complication of multiple mutations per clone, which can result in a substantial "dragging-down" effect, where the presence of a lethal mutation as one of, e.g., 5-10 mutations in a single virus would eliminate the rest of the mutations, some of which might have been neutral by themselves. Each flu segment will be divided into 150 bp stretches. PCR with primers spaced 150 bp apart will be performed using Mutazyme II (Stratagene), an error-prone DNA polymerase. The mutation rate can be titrated to 1 per clone with a shorter sequence; thus, the 150 bp stretch rather than the full segments (900-2300 bp) will be used. Multiple independent libraries will be constructed for each segment, using about 90 plasmid preps, etc. (the full genome is about 13,000 bp). The application of this methodology to influenza A virus resulted in visualization of neutral mutations (end population/beginning population ratios closer to 1.0). The method can have less dragging-down effect, and fewer false-positive lethals. [0582] The primers used to make the mutant libraries will contain all the sequences for next-generation (Illumina) sequencing. After selection of virus in culture using the same conditions that were used for influenza A virus, virus RNA will be recovered from supernatants using the QIAamp kit. Reverse transcription followed by PCR with high-fidelity KOD polymerase will yield next-generation sequencing-ready material. In some cases, the method for error correction can include sequencing of about 10 copies independently. In some cases, the error correction method will sequence in both directions. An Illumina HiSeq2000 can sequence 2x150 bp in a single run. If the per-read error rate is about 1%, the error rate for bidirectional sequencing can be (0.01)^2 = 1/10,000, assuming that sequencing error is random and not due to "hot-spot" bases prone to sequencing error.
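The bidirectional error-rate estimate above follows from multiplying independent per-read error probabilities. The following minimal sketch reproduces that arithmetic; the function name is hypothetical, and the independence of errors between reads is the same assumption stated in the text.

```python
def combined_error_rate(per_read_error: float, independent_reads: int = 2) -> float:
    """Probability that all independent reads are wrong at the same base.

    Assumes sequencing errors are random and independent between reads
    (no error-prone "hot-spot" bases), as stated in the text.
    """
    return per_read_error ** independent_reads


rate = combined_error_rate(0.01)  # forward and reverse read, 1% error each
print(f"combined error rate: {rate:.6f} (about 1 in {round(1 / rate):,})")
```

This is an upper bound on a matching miscall, since both reads would also have to report the same incorrect base for the error to be mistaken for a mutation.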

[0583] Selection in culture and next-generation sequencing [0584] The IBV reverse genetics system will be produced in C227 cells and amplified in A549 cells, or MDCK cells if A549 cells are found to be unsuitable for Influenza B virus replication. Virus produced (in the supernatant) will be titered by limiting dilution assay on A549 cells. An MOI of 0.01 will be used to infect A549 cells; titering will be done again, and a second round of A549 infection will be done with a 0.01 MOI. Viral RNA from supernatant at each of these three rounds will be isolated using the QIAamp viral RNA kit, followed by reverse transcription and PCR using the high-fidelity KOD polymerase. This last step also serves to prepare the library for Illumina sequencing, since the end sequences for NGS will be added (contained on the primer) in the PCR step. A next-generation-sequencing-ready sample representing the originating DNA plasmid will also be prepared by PCR as the pre-selection sample. [0585] Two slides can be sequenced on HiSeq2000 in a single run, with 4 lanes/slide. Each lane can hold 200M tags (molecules); thus bidirectional sequencing can yield 400M sequences. For 90 150 bp fragments, about 5,000-fold coverage (200M tags/lane / (90 x 150 bp x 3 possible mutations/base)) can be obtained in a single lane. There can be four samples (DNA plasmid,

C227 transfection, A549 infection Round 1, and A549 infection Round 2), and four lanes on a single slide can be used for the entire experiment. [0586] Peptide analysis can use similar methods as described for analysis of the influenza A virus peptides and can be based on invariance, high sequence conservation, and predicted binding to HLA. Candidate peptides can be tested in an HLA-transgenic mouse model and/or additional in vivo and in vitro studies.
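The per-lane coverage estimate in paragraph [0585] can be checked with a short calculation. This sketch uses only the figures given in the text (200M tags per lane, 90 fragments of 150 bp, 3 possible point mutations per base); the function and parameter names are hypothetical.

```python
def fold_coverage(tags_per_lane: int, n_fragments: int, fragment_len_bp: int,
                  mutations_per_base: int = 3) -> float:
    """Average number of sequenced tags per possible single-base mutant."""
    n_possible_mutants = n_fragments * fragment_len_bp * mutations_per_base
    return tags_per_lane / n_possible_mutants


coverage = fold_coverage(tags_per_lane=200_000_000, n_fragments=90, fragment_len_bp=150)
print(f"approximate fold coverage per lane: {coverage:.0f}")
```

The result is slightly under 5,000, consistent with the "about 5,000-fold" figure quoted above.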

[0587] Validation by reconstruction of mutants [0588] The sequencing results can be validated by reconstruction of point mutants and growth in culture. A representative collection of lethal and neutral mutations can be compiled and constructed by site-directed mutagenesis. Individual mutants can be tested by transfection of C227 cells followed by infection of A549 cells. Titering by limiting dilution can be done to ascertain a mutant as lethal, neutral, or attenuated. Reconstruction and testing of individual mutants can yield an overall confirmation rate of >80%.

[0589] EXAMPLE VII: HIGH-THROUGHPUT PROFILING OF INFLUENZA A VIRUS HEMAGGLUTININ GENE AT SINGLE-NUCLEOTIDE RESOLUTION [0590] A single-nucleotide resolution genetic approach to interrogate the fitness effect of point mutations in 98% of the amino acid positions in the influenza A virus hemagglutinin (HA) gene was developed. An HA fitness map provides a reference to identify indispensable regions to aid in drug and vaccine design as targeting these regions can increase the genetic barrier for the emergence of escape mutations. A platform is provided for studying genome dynamics, structure-function relationships, virus-host interactions, and further rational drug and vaccine design. The approach can be applied to any virus that can be genetically manipulated. See Wu et al. "High-throughput profiling of influenza A virus hemagglutinin gene at single-nucleotide resolution" Scientific Reports May 13, 2014 (Wu et al. Scientific Reports 2014), herein incorporated by reference in its entirety.

[0591] Results [0592] High-throughput genetic approach at single-nucleotide resolution [0593] A high-throughput genetic platform can be used to randomly mutagenize at each position of the genome, monitor the enrichment or diminishment of each point mutation under a specified growth condition, and perform massive deep-sequencing to determine which mutations are associated with negative, neutral, or positive fitness outcomes under the given growth condition. The mutant library was created on influenza A/WSN/1933 (H1N1) hemagglutinin (HA) gene by performing error-prone PCR on the eight-plasmid reverse genetics system. Subsequently, the viral mutant library was generated by transfection and passaged for two 24- hour replication selection rounds in A549 cells (human lung epithelial carcinoma cells) (Fig. 2 A). The plasmid library and the passaged viral library were each sequenced by Illumina HiSeq 2000. Individual mutants can experience an identical selection pressure with other mutants in the pool during the course of transfection and infection. Comparing the genetic compositions of the plasmid library and the passaged viral library can reflect the variation in replication rates for each mutation. A relative fitness index (RF index) was used as a proxy for the fitness effect of individual mutations. The RF index can be calculated as: RF index = (occurrence frequency in passaged library)/(occurrence frequency in plasmid library) [0594] The occurrence frequency of individual mutations was expected to be lower than the sequencing error rate of 0.1% in the Illumina next generation sequencing (NGS). [0595] A two-step PCR approach was used for library preparation to distinguish true mutations from sequencing errors (Fig. 27B). In the first PCR, the HA gene was divided into 12 amplicons for amplification with a unique tag assigned to individual molecules. In the second PCR, multiple identical copies for individual tagged molecules were generated. 
The input copy number for the second PCR was well-controlled such that after a sub-saturation PCR, individual tagged molecules would be sequenced about 10 times. In some cases, true mutations would exist in most, if not all, sequencing reads sharing the same tag, whereas sequencing errors would not. This error-correction approach can be based on a valid assumption that occurrence of sequencing error is independent of the identity of the nucleotide tag. Therefore, sequencing errors can be distinguished from true mutations. Individual molecules, each carrying a unique tag, have an average copy number of about 10 (median = 10) in the sequencing data, which verified the sequencing library preparation design.
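The tag-based error correction described above can be illustrated with a small consensus-calling sketch. It assumes that reads have already been paired with their molecular tag and trimmed to equal length; the function name, the minimum read count, and the agreement threshold are hypothetical choices for illustration and are not values taken from the text.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def consensus_by_tag(reads: Iterable[Tuple[str, str]],
                     min_reads: int = 3,
                     min_fraction: float = 0.8) -> Dict[str, str]:
    """Collapse reads sharing a tag into one consensus sequence per tag.

    A base is kept only if at least `min_fraction` of the reads with that tag
    agree on it; otherwise it is masked as 'N'. True mutations are expected in
    most reads of a tag group, whereas random sequencing errors are not.
    """
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)

    consensus = {}
    for tag, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few copies to separate errors from mutations
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue  # this sketch assumes equal-length reads
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_fraction else "N")
        consensus[tag] = "".join(bases)
    return consensus
```

For example, a tag with ten reads in which one read carries an isolated mismatch yields the majority sequence, while a tag in which all ten reads carry the same change retains that change as a candidate true mutation.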

[0596] Point mutation fitness profiling of hemagglutinin [0597] The RF indices of individual point mutations were profiled across 98% of amino acid positions of HA in biological duplicate (Spearman correlation = 0.78) (Fig. 28A). The remaining 2% of amino acid positions not observed were from the termini of HA, where the first and last amplicon primers are located. Silent mutations and nonsense mutations provided an internal control to assess the data quality. In some instances, silent mutations, which alter the nucleotide sequence but not the amino acid sequence, impose a fitness cost. Nonsense mutations, which can result in a truncated protein product, can be lethal to the virus. Silent mutations can have a higher RF index than nonsense mutations (P < 2e-16, two-tailed Student's t-test) (Fig. 28B). In addition, the RF index distributions of silent mutations and nonsense mutations can be well separated. However, several silent mutations with a low RF index were observed. Silent mutations can play a role in codon usage, RNA structure, and other functions beyond protein-coding.
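The RF index defined earlier in this example (occurrence frequency in the passaged library divided by occurrence frequency in the plasmid library) can be computed from read counts as sketched below. The function name, the count-based inputs, and the handling of mutations absent from the plasmid library are assumptions made for illustration.

```python
from typing import Optional


def rf_index(passaged_count: int, passaged_total: int,
             plasmid_count: int, plasmid_total: int) -> Optional[float]:
    """RF index = (frequency in passaged library) / (frequency in plasmid library).

    Returns None when the mutation was not observed in the plasmid library or
    a library is empty, since the ratio is undefined in those cases.
    """
    if plasmid_count == 0 or plasmid_total == 0 or passaged_total == 0:
        return None
    freq_passaged = passaged_count / passaged_total
    freq_plasmid = plasmid_count / plasmid_total
    return freq_passaged / freq_plasmid
```

Under this definition, a value near 1 indicates a mutation whose frequency was unchanged by passaging, and a value near 0 indicates a mutation that was depleted.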

[0598] The fitness data is consistent with the reported phenotypes of mutants that have been previously characterized in the literature. Examples include a temperature-sensitive substitution (Y174H), a host-switching substitution (D238G), two thermodynamic stabilizing substitutions (D111E and Q299R), and four HA cleavage site substitutions (Y342H, Y342C, Y342N and Y342F) (Fig. 31). Y174H, D238G, Y342H, Y342C, and Y342N, which are expected to be deleterious under the experimental condition (see footnote in Fig. 31), have a relatively low RF index (ranging from 0.04 to 0.23). On the other hand, D111E, Q299R, and Y342F, which are expected to be neutral under the experimental condition, have a relatively high RF index (ranging from 0.37 to 1.03). The dataset and the experimental results reported in the literature can be consistent. [0599] Independent experimental validation also confirmed the dataset. Six randomly selected point mutations were individually reconstructed and analyzed. The RF indices of the mutations have a positive correlation with the TCID50 values measured from a rescue experiment (Fig. 29A-29B). The reliability of the fitness profiling data can thus be verified, and the platform can be demonstrated to be comprehensive and of high resolution.

[0600] Structural analysis of hemagglutinin [0601] The platform can have a high sensitivity for monitoring negative selection in addition to positive selection and can enable the identification of deleterious mutations that disappear throughout viral passaging. The availability of the influenza HA crystal structure can be used to extrapolate further structural insights from the dataset. A Spearman correlation of 0.30 was observed between the RF index and the relative solvent accessible surface area (SASA) of HA (P < 2e-16). Surface residues can be more tolerant to substitutions than core residues. The results can be consistent with observations in cellular proteins. The fitness effects of mutations were also analyzed in different types of structural elements, namely α-helices (mean log10 RF index = -1.19), β-strands (mean log10 RF index = -0.97), turns (mean log10 RF index = -0.98) and coils (mean log10 RF index = -1.01). Mutations in α-helices are more deleterious than mutations in β-strands (P = 1e-4), turns (P = 1e-3) and coils (P = 2e-3). In contrast, the fitness effects of mutations in β-strands, turns, and coils are not different from each other (P > 0.4). Functional elements in HA can be contained within α-helices. [0602] Each α-helix was also investigated by computing its individual mean log10 RF index (see Fig. 4A of Wu et al. Scientific Reports 2014). Consistent with the SASA analysis, the α-helices located at the core of HA1 are the least tolerant to mutations (red and pink, mean log10 RF index = -1.52 and -1.40 respectively). The other α-helix in HA1 is also relatively intolerant to mutations (log10 RF index = -1.11), which is consistent with its role in receptor binding for viral entry. In HA2, the two α-helices located at the stem-loop region can be relatively intolerant to mutations (mean log10 RF index = -1.11 and -1.22 respectively), which can be attributed to their functional role in membrane fusion during viral entry. All of the mean log10 RF indices reported above can be lower than that of the entire HA (mean log10 RF index = -1.04). Alpha-helices in HA can play a role in different functional mechanisms. [0603] The non-structural loop region that interspaces the aforementioned helices can be more tolerant to mutations compared to its neighboring α-helices (mean log10 RF index = -0.76) (see Fig. 4A of Wu et al. Scientific Reports 2014). This region can undergo a transition from a non-structural loop to an α-helix during membrane fusion. In some cases, the structural requirement for this transition is not stringent. This can be further evidenced by a proline substitution analysis (see Fig. 4B of Wu et al. Scientific Reports 2014). Among all 20 standard amino acids, proline can have the poorest α-helix formation propensity as its presence can result in a break or a kink of an α-helix. Proline substitutions in an α-helix can carry a low RF index (deleterious). All proline substitutions in the HA α-helices can have a log10 RF index < -1. Two out of three proline substitutions in the non-structural loop have a log10 RF index > -1 (-0.81 and -0.19 respectively). In some cases, formation of a continuous α-helix in this region is not a strict requirement during membrane fusion. [0604] An α-helix can play a role in homotrimer formation (see Fig. 4A of Wu et al. Scientific Reports 2014). Helical wheel projection showed hydrophobicity at heptad position d (see Fig. 4C of Wu et al. Scientific Reports 2014). The RF index of the amino acid substitutions at heptad position d was investigated (Fig. 31; see also Fig. 4D of Wu et al. Scientific Reports 2014). The silent mutation at G430 had the lowest RF index (0.24) among all silent mutations at this heptad position. This RF index was employed as a reference to identify substitutions that have a relatively neutral fitness effect. In this case, three out of 27 amino acid substitutions at this heptad position have an RF index > 0.24, namely Y437F (RF index = 0.35), V465I (RF index = 0.40) and V465A (RF index = 0.30). These three substitutions can be conserved in volume and hydrophobicity. Residues at heptad position d can have a stringent structural constraint in side chain conformation and hydrophobicity for homotrimer formation.
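The use of the lowest silent-mutation RF index as a reference for identifying relatively neutral substitutions can be sketched as a simple filter. The function name is hypothetical. The three substitutions and the reference value of 0.24 are those reported above, and the remaining substitutions at this heptad position are not reproduced here.

```python
from typing import Dict, List


def relatively_neutral(rf_by_substitution: Dict[str, float],
                       reference_rf: float) -> List[str]:
    """Return substitutions whose RF index exceeds the silent-mutation reference."""
    return sorted(name for name, rf in rf_by_substitution.items()
                  if rf > reference_rf)


# Reference: lowest RF index among silent mutations at heptad position d (G430).
SILENT_REFERENCE_RF = 0.24

# RF indices reported in the text for heptad position d.
reported = {"Y437F": 0.35, "V465I": 0.40, "V465A": 0.30}
print(relatively_neutral(reported, SILENT_REFERENCE_RF))
```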

[0605] Fig. 4 of Wu et al. Scientific Reports 2014 shows structural analysis on hemagglutinin. Fig. 4A of Wu et al. Scientific Reports 2014 illustrates all α-helices and a non-structural loop in HA as highlighted. Mean log10 RF indices for individual highlighted structural elements are shown. The log10 RF indices for all observed X→P mutations (where X can be any amino acid other than P) in individual highlighted structural elements are plotted as stripcharts (see Fig. 4B of Wu et al. Scientific Reports 2014). The colors of the stripcharts match the highlight colors of the corresponding structural elements in panel A. The bottom stripchart represents the non-structural loop that undergoes α-helix formation during membrane fusion. A helical wheel was constructed by DrawCoil 1.0 (HyperTextTransferProtocol://WorldWideWeb.grigoryanlabDOTorg/drawcoil/, wherein "HyperTextTransferProtocol" is "http", "WorldWideWeb" is "www", and "DOT" is ".") (see Fig. 4C of Wu et al. Scientific Reports 2014). Fig. 30 (also see Fig. 4D of Wu et al. Scientific Reports 2014) shows a bar chart that represents the RF indices of all profiled amino acid substitutions at heptad position d. RF indices of silent mutations are also included for comparison (see also Fig. 31 herein).

[0606] Identification of essential regions [0607] Profiling can provide information to identify essential protein surfaces and indispensable regions useful for vaccine epitopes. The genetic platform can provide the relative fitness effects of an average of five substitutions per amino acid residue. The RF indices of the most destructive substitutions in the dataset can be projected on the HA structure to identify putative functional regions that cannot tolerate certain amino acid substitutions (see Figure 5A and Figure 5B of Wu et al. Scientific Reports 2014). The RF indices of the least destructive substitutions for HA can be projected on the HA structure to identify essential regions that are intolerable to any substitution (see Fig. 5C of Wu et al. Scientific Reports 2014). The trimer formation surface (see Fig. 5A and Fig. 5B of Wu et al. Scientific Reports 2014) and the stem domain (see Fig. 5B and Fig. 5C of Wu et al. Scientific Reports 2014), which can be a functional component of the membrane fusion machinery in HA, show as essential regions in the profiling data. The dataset identified the cross-subtype conserved influenza HA stalk region as an indispensable region (see Fig. 5C and Fig. 5D of Wu et al. Scientific Reports 2014), which is at the binding site of the proposed influenza universal antibody, CR6261. The side-chain interactions at this site can be important for CR6261 recognition. Several missense substitutions in the binding site can be allowed, and they are conservative substitutions (N389D and T392S). In some cases, these missense mutations are unlikely to disrupt antibody recognition (see Fig. 5C and Fig. 5D of Wu et al. Scientific Reports 2014). Antigenic sites on the globular head of HA were largely tolerable to substitutions (see Fig. 5C of Wu et al. Scientific Reports 2014). 
A functional basis for the tendency of this domain to rapidly undergo genetic drift can be provided, which can adversely affect both natural and vaccine-induced immunity. The methods described herein can provide details on the genetic cost for individual point mutations across HA. This dataset can provide a reference for rational vaccine design.

[0608] Fig. 5 of Wu et al. Scientific Reports 2014 illustrates the essential regions on hemagglutinin. Fig. 5A and Fig. 5B of Wu et al. Scientific Reports 2014 illustrate the RF indices of the most destructive missense substitutions in the profiling data for individual amino acids, projected on the HA protein structure to identify essential regions intolerable to mutations. Fig. 5C of Wu et al. Scientific Reports 2014 shows the RF indices of the least destructive missense substitutions in the profiling data for individual amino acids, projected on the HA protein structure to identify essential regions intolerable to mutations. The inset represents the side chain interaction between HA and the proposed influenza universal antibody CR6261 (PDB: 3GBN). Parentheses represent the residue naming according to HA. The mean log10 RF indices of non-conservative mutations for each residue are shown. Note that residue 389 is an aspartic acid in the structure but is an asparagine in the wild type HA sequence. A compatible rotamer for T392 was generated using PyMOL to display the hydrogen bond. All hydrogen bonds (black dotted lines) are displayed as described. (A-C) Red: RF index < 0.05; Orange: RF index < 0.1; Green: other. The structure is based on PDB: 1RUZ. Fig. 5C of Wu et al. Scientific Reports 2014 illustrates the RF indices for missense mutations within the universal antibody recognition sites. A conservative substitution is defined as having a positive score in the BLOSUM80 matrix.
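The color scheme used to project RF indices onto the structure can be expressed as a small mapping. This sketch encodes only the thresholds given in the figure description above; the function name is hypothetical.

```python
def rf_color(rf_index: float) -> str:
    """Map an RF index to the display color used for structure projection.

    Red: RF index < 0.05; orange: RF index < 0.1; green: all other values.
    """
    if rf_index < 0.05:
        return "red"
    if rf_index < 0.1:
        return "orange"
    return "green"
```

Because the comparisons are strict, a residue with an RF index of exactly 0.05 falls in the orange class and one with exactly 0.1 falls in the green class.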

[0609] Methods

[0610] Viral mutant library and point mutations

[0611] A plasmid mutant library was created by performing error-prone PCR on the HA segment of the eight-plasmid reverse genetics system of influenza A/WSN/1933 (H1N1). The HA gene insert was PCR-amplified with error-prone polymerase Mutazyme II (Stratagene, La Jolla, CA). The mutation rate of the error-prone PCR was optimized by adjusting the input template amount to avoid the accumulation of deleterious mutations. The restriction enzyme site BsmBI was present in the PCR primers, and used to clone into a BsmBI-digested parental vector pHW2000. Ligations were carried out with high concentration T4 ligase (Life Technologies, Carlsbad, CA). Transformations were carried out with electrocompetent MegaX DH10B T1R cells (Life Technologies), and >200,000 colonies were scraped and directly processed for plasmid DNA purification (Qiagen Sciences, Germantown, MD). As extensive trans-complementation was expected during the transfection step, >35 million cells were used for transfection to average out any bias or artifact generated from possible trans-complementation. Point mutants for the validation experiment were constructed using the QuikChange XL Mutagenesis kit (Stratagene) according to the manufacturer's instructions.

[0612] Transfections, infections, and titering

[0613] C227 cells, a dominant negative IRF-3 stably expressing cell line derived from human embryonic kidney (293T) cells, were transfected with Lipofectamine 2000 (Life Technologies) using the HA mutant library plasmid plus 7 other wildtype plasmids. Supernatant was replaced with fresh cell growth medium at 24 hours and 48 hours post-transfection. At 72 hours post-transfection, supernatant containing infectious virus was harvested, filtered through a 0.45 µm MCE filter, and stored at -80°C. The TCID50 was measured on A549 cells (human lung carcinoma cells).

[0614] Virus from the C227 transfection was used to infect A549 cells at an MOI of 0.05. Infected cells were washed three times with PBS followed by the addition of fresh cell growth medium at 2 hours post-infection. Virus was harvested at 24 hours post-infection. For the mutant library profiling, the HA mutant library was passaged for two 24-hour rounds in A549 cells. The pilot experiments revealed that two rounds of passaging were sufficient for profiling. The biological duplicate was performed with an independently transfected viral library, followed by two rounds of passaging as described above.

[0615] Sequencing library preparation

[0616] Viral RNA was extracted from the passaged viral mutant library using QIAamp Viral RNA Mini Kit (Qiagen Sciences) and was reverse transcribed to cDNA using Superscript III reverse transcriptase (Life Technologies). DNA from the plasmid library or cDNA from the passaged viral mutant library was amplified with both forward and reverse primers each flanked with a 6 "N" tag and the Illumina flow cell adapter region. Flanking region for 5' primer: 5'-CTA CAC GAC GCT CTT CCG ATC TNN NNN N-3' (SEQ ID NO: 89). Flanking region for 3' primer: 5'-TGC TGA ACC GCT CTT CCG ATC TNN NNN N-3' (SEQ ID NO: 90). Following PCR, 12 amplicon products were pooled together. 1.5 million copies of the pooled product were used as the input for the second PCR, which was equivalent to 10 paired-end reads per molecule if 15 million paired-end reads were sequenced. 5'-AAT GAT ACG GCG ACC ACC GAG ATC TA CAC TCT TTC CCT ACA CGA CGC TCT TCC G-3' (SEQ ID NO: 91) and 5'-CAA GCA GAA GAC GGC ATA CGA GAT CGG TCT CGG CAT TCC TGC TGA ACC GCT CTT CCG-3' (SEQ ID NO: 92) were used as the primers for the second PCR. Products of the second PCR were submitted for next generation sequencing. The error-correction technique described herein is similar to a technique used for detecting rare mutations in human cells. However, the method described herein included the fine restraint of limiting the input tagged template copy number and PCR efficiency during the second step PCR to accurately control the distribution of cluster size in the sequencing output to a median of 10. Raw sequencing data have been submitted to the NIH Short Read Archive under accession number: BioProject PRJNA243038.
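The relationship between input copy number and sequencing depth stated above can be checked with a short Python sketch. The function names are hypothetical. Treating the two 6 "N" tags together as a 12-base molecule identifier is an assumption made here for illustration, and the calculation ignores unequal PCR efficiency and read loss.

```python
# Illustrative arithmetic only; function names are hypothetical.

def expected_reads_per_molecule(total_read_pairs, input_copies):
    """Average number of paired-end reads per tagged input molecule."""
    return total_read_pairs / input_copies


def tag_combinations(random_bases_per_primer=6, primers=2):
    """Number of distinct tags if the random bases on both primers are combined
    (an assumption; each random position can be A, C, G, or T)."""
    return 4 ** (random_bases_per_primer * primers)


# 1.5 million input copies sequenced with 15 million paired-end reads
reads_per_molecule = expected_reads_per_molecule(15_000_000, 1_500_000)
print(reads_per_molecule)   # average cluster size targeted by the design
print(tag_combinations())   # size of the tag space under the stated assumption
```

Under these assumptions the average cluster size is 10 reads per tagged molecule, and the tag space (4^12) is roughly eleven times larger than the 1.5 million input copies.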

[0617] Data analysis

[0618] Sequencing reads were mapped by BWA with a maximum of six mismatches and no gap. Amplicons with the same tag were collected to generate a read cluster. Since each read cluster originated from the same template, true mutations were called only if the mutations occurred in 90% of the reads within a read cluster. In some instances, this error-correction approach would only correct errors that occurred during the deep sequencing process but not those that were introduced during the reverse transcription process. Read clusters with a size below three reads were filtered out. Read clusters were further conflated into "error-free" reads. Average coverages in terms of "error-free" reads were 177028 per nucleotide in the plasmid mutant library, 112355 per nucleotide in replicate 1 of the passaged viral mutant library, and 161773 per nucleotide in replicate 2 of the passaged viral mutant library (Fig. 32A). The relative fitness index (RF index) for individual point mutations was computed by: RF index = (occurrence frequency in passaged library)/(occurrence frequency in plasmid library)

[0619] For all the downstream analysis, only point mutations covered with >30 tag-conflated reads ("error-free" reads) in the plasmid library were included. This arbitrary cutoff filtered out mutants with low statistical confidence, which amounted to about 16% of all possible point mutations (Fig. 32B). In addition, C → A and G → T mutations are excluded from the reported dataset due to observed DNA oxidative damage during library preparation. The RF index for a given amino acid substitution was calculated by averaging all RF indices available for that substitution.
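The RF index calculation and filters described above can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in the study. The record layout, field names, and example counts are hypothetical, and the coverage cutoff is interpreted here as the number of "error-free" reads carrying the mutation in the plasmid library, which is an assumption.

```python
from collections import defaultdict

MIN_PLASMID_READS = 30                      # cutoff: more than 30 error-free reads
EXCLUDED_CHANGES = {("C", "A"), ("G", "T")}  # excluded due to oxidative damage


def rf_index(passaged_count, passaged_depth, plasmid_count, plasmid_depth):
    """RF index = frequency in passaged library / frequency in plasmid library."""
    passaged_freq = passaged_count / passaged_depth
    plasmid_freq = plasmid_count / plasmid_depth
    return passaged_freq / plasmid_freq


def amino_acid_rf_indices(records):
    """Average the RF indices of all point mutations encoding the same substitution."""
    per_substitution = defaultdict(list)
    for rec in records:
        if (rec["ref"], rec["alt"]) in EXCLUDED_CHANGES:
            continue
        if rec["plasmid_count"] <= MIN_PLASMID_READS:
            continue
        per_substitution[rec["aa_sub"]].append(
            rf_index(rec["passaged_count"], rec["passaged_depth"],
                     rec["plasmid_count"], rec["plasmid_depth"])
        )
    return {sub: sum(vals) / len(vals) for sub, vals in per_substitution.items()}


# Hypothetical records: two point mutations encoding the same substitution,
# one excluded C->A change, and one mutation below the coverage cutoff.
records = [
    {"ref": "A", "alt": "G", "aa_sub": "X10Y", "plasmid_count": 100,
     "plasmid_depth": 100000, "passaged_count": 50, "passaged_depth": 100000},
    {"ref": "T", "alt": "C", "aa_sub": "X10Y", "plasmid_count": 200,
     "plasmid_depth": 100000, "passaged_count": 20, "passaged_depth": 100000},
    {"ref": "C", "alt": "A", "aa_sub": "X11Z", "plasmid_count": 500,
     "plasmid_depth": 100000, "passaged_count": 500, "passaged_depth": 100000},
    {"ref": "G", "alt": "A", "aa_sub": "X12W", "plasmid_count": 10,
     "plasmid_depth": 100000, "passaged_count": 10, "passaged_depth": 100000},
]
print(amino_acid_rf_indices(records))
```

In this hypothetical input, only the substitution labeled X10Y survives the filters, and its RF index is the mean of the two contributing point mutations.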

[0620] Structural analysis

[0621] The solvent accessible surface area (SASA) for individual residues was computed from PyMOL using the default "get_area" function. SASA obtained from the folded structure was then normalized with the SASA calculated from an unfolded structure to obtain the relative SASA. Secondary structure assignment was performed by STRIDE. The structural analysis was based on PDB: 1RUZ. A two-tailed Student's t-test was employed to compare the log10 RF indices in different types of structural elements. Missense mutations are included in the analysis unless otherwise stated.

[0622] Discussion

[0623] Identification of residues essential for viral replication can be inferred by sequence conservation. Observed sequence conservation can derive from the viral sequences that initiate an endemic, and can be influenced by the host genetic background and the specific immune responses associated with the host. In some instances, conservation is not equivalent to essentialness for viral replication in cells. Mutational analysis of conserved amino acid residues on influenza A virus can reveal that a fraction of conserved residues can be dispensable in viral replication. New mutations can emerge every flu season. In some cases, a certain portion of residues that are currently conserved are still capable of mutating in the natural environment and of providing a fitness advantage under future unforeseen selection pressures. In some instances, a conserved amino acid is not essential to viral replication. In some instances, analyses of conserved sequences provide information on viral genetic elements that survived in the selected human population in recent history, but do not provide information on viral genetic elements that were unable to survive the selection process, nor about which host factor was responsible for exerting the selection. In some embodiments, a method described herein provides a complementary, yet more direct approach to identify amino acid residues that are involved in viral replication in a defined cellular environment.

[0624] A method described herein can enable the simultaneous functional profiling of point mutations across the entire influenza HA at single-nucleotide resolution to determine their roles in viral replication. A method described herein can provide an efficient tool to address biomedical questions. The fitness profiling data can allow the study of structure-function relationships at single-amino acid resolution. It can enable the search for essential protein surfaces on available structures and can offer a reference for drug design approaches that aim to increase the genetic barrier for the emergence of escape mutations. Essential peptide stretches can also provide potential targets for drug and vaccine development. The genetic platform can be applied to study viral genome dynamics and identify residues for virus-host interactions in specific cellular responses (such as apoptosis, autophagy, inflammasome induction, ER stress, etc.) and immune responses (such as NK cells, T cells, antibodies, macrophages, cytokines, etc.). The current development of a live attenuated influenza vaccine has been based on the modification of NS1 to increase interferon sensitivity. This study provides a platform for alternative strategies. Comparing the in vitro fitness profile with an in vivo profile can also permit the identification of mutants that replicate efficiently in vitro but not in vivo. The resultant information, when coupled with known mutants that are sensitive to a specified immune response, can help identify mutants that achieve a higher titer during vaccine production but exhibit an attenuated phenotype after injection into the human body, where an intact immune system is present. The platform can be applicable to other viral or microbial genomes where genetic manipulation is available in the laboratory.

[0625] EXAMPLE VIII: HIGH-THROUGHPUT FUNCTIONAL ANNOTATION OF INFLUENZA A VIRUS GENOME AT SINGLE-NUCLEOTIDE RESOLUTION

[0626] A genome-wide genetics platform is presented which can permit functional interrogation of all point mutations across a viral genome in parallel. A fitness profile of individual point mutations across an influenza virus genome was generated. Essential residues on the viral genome were systematically identified, which provided a collection of subdomain data informative for structure-function studies and for effective rational drug and vaccine design. Data were consistent with known, well-characterized structural features. A validation rate of 68% for severely attenuated mutations and 94% for neutral mutations was achieved. The approach described in this example is applicable to other viral or microbial genomes where a means of genetic manipulation is available. See Wu et al. (bioRxiv 2014), which is incorporated by reference in its entirety.

[0627] Results

[0628] Quantification of the fitness effect of individual point mutations

[0629] A high-throughput genetics platform provided herein can randomly mutagenize each nucleotide of the genome and monitor the changes in occurrence frequency for individual point mutations under specified growth conditions using massive deep-sequencing. The changes in occurrence frequency of each point mutation (such as diminishment or enrichment) allow quantification of the mutational fitness outcomes under the given growth conditions. The mutant libraries were created by error-prone PCR on the eight-plasmid reverse genetics system of influenza A/WSN/1933 (H1N1). Subsequently, 8 viral mutant libraries were generated by transfection, each with one of the 8 segments mutagenized. All viral mutant libraries were passaged for two 24-hour rounds in A549 cells (human lung epithelial carcinoma cells). The plasmid library and the passaged viral library were each sequenced by Illumina HiSeq 2000. A relative fitness index (RF index) can be used to estimate the mutational fitness effect. The RF index is calculated as: RF index = (occurrence frequency in passaged library)/(occurrence frequency in plasmid library). The occurrence frequency of individual mutations was expected to be lower than the sequencing error rate (0.1%-1%) in next generation sequencing (NGS). A two-step PCR approach for sequencing library preparation was used to distinguish true mutations from sequencing errors. In the first PCR, a unique tag was assigned to individual molecules. The second PCR generated multiple identical copies for individual tagged molecules. The input copy number for the second PCR was well-controlled such that individual tagged molecules would be sequenced about 10 times. True mutations would exist in all sequencing reads sharing the same tag, whereas sequencing errors would not. Individual molecules, each carrying a unique tag, have an average copy number of about 10 in the sequencing data, which validated the sequencing library preparation design (Fig. 33).
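The reasoning behind tag-based error correction can be illustrated with a simple binomial estimate in Python. This is a sketch under simplifying assumptions that are not stated in the source: sequencing errors are treated as independent between reads, every error at a position is assumed to produce the same wrong base (which makes the estimate an upper bound), and a call is assumed to require at least 9 of 10 reads in a cluster. The function name is hypothetical.

```python
from math import comb


def prob_error_in_at_least(min_reads, cluster_size, error_rate):
    """Probability that an independent per-read error appears in at least
    min_reads of cluster_size reads sharing one tag (binomial tail)."""
    return sum(
        comb(cluster_size, k) * error_rate ** k * (1 - error_rate) ** (cluster_size - k)
        for k in range(min_reads, cluster_size + 1)
    )


# Upper end of the quoted sequencing error rate (1%), cluster of 10 reads,
# call threshold assumed to be 9 of 10 reads.
print(prob_error_in_at_least(9, 10, 0.01))
# For comparison: the chance of the same error in a single uncorrected read.
print(prob_error_in_at_least(1, 1, 0.01))
```

Under these assumptions, a per-read error rate of 1% corresponds to a chance on the order of 10^-17 that a sequencing error is called as a mutation within a 10-read cluster, far below the occurrence frequency expected for true mutations.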

[0630] Point mutation fitness profiling of influenza A virus genome

[0631] The RF indices for individual point mutations were profiled across 96% of nucleotide positions in the influenza A/WSN/1933 virus genome (see Fig. 1 of Wu et al. bioRxiv 2014). Fig. 1 of Wu et al. bioRxiv 2014 shows single-nucleotide resolution fitness profiling. The RF index for individual point mutations across the genome was computed. The natural log of the RF index, which is the ratio of occurrence frequency in the passaged library to the occurrence frequency in the plasmid library, represents the y-axis. Each nucleotide position is represented by four consecutive lines for the RF index that correspond to mutating to A, T, C, or G. The RF index of WT nucleotides is set as zero. Only point mutations with a coverage of > 30 tag-conflated reads in the plasmid library are shown. Point mutations with < 30 tag-conflated reads in the plasmid library are plotted as a gray dot on the zero baseline. The remaining 4% of nucleotides were from the termini of each gene segment and were not covered due to PCR amplification difficulty. A positive correlation can exist between RF index and the degree of amino acid conservation of missense mutations (Fig. 34). In addition, the fitness data for well-characterized mutants were consistent with their phenotypes reported in the literature. Examples include a salt bridge for viral replication on nucleoprotein (NP) (Fig. 35A), replication enhancement mutation on polymerase subunit (PB2) (Fig. 35B), attenuation of oseltamivir resistance mutation on neuraminidase (NA) (Fig. 35C), low fitness cost of amantadine/rimantadine resistance mutations on ion channel (M2) (Fig. 35D), and the basic stretch on matrix protein (M1) required for assembly (see Supplemental Figure 4 of Wu et al. bioRxiv 2014). Supplemental Figure 4 of Wu et al. bioRxiv 2014 shows a structural analysis of M1. Supplemental Fig. 4A of Wu et al. bioRxiv 2014 shows the RF index of the least destructive missense mutations for individual amino acids on the M1 segment projected on the protein structure (PDB: 1EA3) to identify indispensable regions. Supplemental Fig. 4B shows the residues 76RRR78 displayed in stick format as an inset. It has been suggested that this basic amino acid stretch is important for virus assembly and/or budding. Viruses with substitutions at these positions show an attenuated phenotype. The non-structural region at the C-terminal end of 76RRR78 is also indispensable in the profiling data. This suggests that the entire non-structural region containing the 76RRR78 basic stretch can be functionally important. One possibility for functional importance is that it provides an interface for a protein-protein interaction. Furthermore, comparison of the fitness data with the previously reported polymerase activity of 19 PB1 mutants showed an 80% correlation. Mutants that displayed a severely attenuated (RF index <0.05) or neutral (RF index >0.4) phenotype were randomly selected across the genome, individually constructed and tested. The replication phenotype of each single mutant validated the profiling data with a confirmation rate of 68% for severely attenuated mutations and 94% for neutral mutations (Fig. 36). These data taken together can support the validity of the fitness profiling data set.

[0632] Structural analysis and identification of indispensable protein surface

[0633] The high-throughput profiling technique can provide a basis to identify essential protein surfaces for drug targeting and indispensable regions for vaccine epitopes. A structural analysis on NA, a major influenza vaccine antigen, was performed. In some instances, a cluster of essential residues at the tetramer formation interface was identified, suggesting that it bears functional importance and can possibly be a drug targeting site. Such a large cluster of essential residues could not be found in any other part of the NA surface. The lack of essential residues on the NA surface can explain the functional basis of antigenic drift. Fig. 3 of Wu et al. bioRxiv 2014 shows structural analysis of the NA homotetramer interface. The RF indices of the least destructive missense mutations for individual amino acids on the NA segment were projected on the protein structure (PDB: 3CL0) to identify essential regions. Residues are grouped as RF index < 0.1, 0.1 < RF index < 0.2, and uncovered.

[0634] In some instances, a structural analysis was performed using the PA subunit of the influenza virus RNA polymerase as an example to search for indispensable regions to aid in rational drug design. PA can be a valuable target for drug development because PA can be polyfunctional. The fitness data provided an informative reference for rational drug design. It captured several interactions between PA and PB1, such as the hydrogen bond between PA E617 and PB1 K11 (see Fig. 4A of Wu et al. bioRxiv 2014), and the hydrophobic interaction between PA and PB1 via the volume-filling residues L666 and F710 (see Fig. 4B of Wu et al. bioRxiv 2014). It also revealed a cluster of essential residues on the PA surface consisting of 8 amino acids (see Fig. 4C of Wu et al. bioRxiv 2014), including K539 and K574, which were previously shown to be part of a lead compound binding pocket. This patch of amino acids can be involved in a protein-protein interaction for viral replication. Similar analyses using the dataset have been applied to the PA endonuclease domain and the M2 ion channel, which can be targets in drug development (see Supplemental Figs. 5 and 6 of Wu et al. bioRxiv 2014). Supplemental Fig. 5 of Wu et al. bioRxiv 2014 shows structural analysis of the PA endonuclease domain. The RF indices of the least destructive missense mutations in the profiling data for individual amino acids on the PA segment are projected on the PA endonuclease crystal structure (PDB: 4E5G). A helix-helix interface, which consists of T40, V44, M47, I171, R174, and I178, is highlighted. It demonstrates the power of qHRG in identifying residues that are not continuous in the primary sequence.

[0635] Supplemental Fig. 6 of Wu et al. bioRxiv 2014 shows structural analysis of the M2 ion channel. The RF indices of the least destructive missense mutations in the profiling data for individual amino acids on the M2 protein are projected on the M2 ion channel structure (PDB: 2RLF). An indispensable region on the transmembrane helix is highlighted. The data captured the essential amino acids W41 and H37, which are involved in M2 ion channel activation. Several adjacent hydrophobic residues, I35, L36, and L38, were also identified as relevant residues, which can be attributed to their contact with the hydrophobic membrane.

[0636] Projecting the fitness profiling data on three-dimensional protein structures can enable the identification of novel putative essential structural motifs that can be surface exposed but not necessarily sequential in the primary sequence. This type of analysis can reveal biological targets useful for rational drug and vaccine design. In some embodiments, a method described herein is utilized for antiviral drug design with in silico drug screening to increase the efficiency of therapeutic identification.

[0637] Fig. 4 of Wu et al. bioRxiv 2014 shows structural analysis of the RNA polymerase PA subunit. The RF indices of the least destructive missense mutations in the profiling data for individual amino acids on the PA segment are projected on the PA-PB1 complex crystal structure (PDB: 2ZNL). The fitness data are capable of identifying several relevant interactions and putative functional sites. Fig. 4A of Wu et al. bioRxiv 2014 shows a hydrogen bond between PA E617 and PB1 K11. Substitution of PA E617 is deleterious in the fitness data. Fig. 4B of Wu et al. bioRxiv 2014 shows a hydrophobic interaction between PA L666 and F710 and PB1. Substitution of L666 is deleterious in the fitness data. Fig. 4C of Wu et al. bioRxiv 2014 shows a cluster of 8 essential residues on the surface of PA.

[0638] Materials and Methods

[0639] Viral mutant library and point mutations

[0640] Plasmid mutant libraries were created by performing error-prone PCR on the eight-plasmid reverse genetics system of influenza A/WSN/1933 (H1N1). The flu insert was PCR-amplified with error-prone polymerase Mutazyme II (Stratagene, La Jolla, CA). The mutation rate of the error-prone PCR was optimized by adjusting the input template amount to avoid the accumulation of deleterious mutations. The restriction enzyme sites BsmBI and/or BsaI were added to the PCR primers, and used to clone into a BsmBI-digested parental vector pHW2000. Ligations were carried out with high concentration T4 ligase (Invitrogen, Grand Island, NY). Transformations were carried out with electrocompetent MegaX DH10B T1R cells (Invitrogen), and > 100,000 colonies for each segment library were scraped and directly processed for plasmid DNA purification (Qiagen Sciences, Germantown, MD). As extensive trans-complementation was expected during the transfection step, > 35 million cells were used for transfection to average out any bias or artifact generated from possible trans-complementation. Point mutants for the validation experiment were constructed using the QuikChange XL Mutagenesis kit (Stratagene) according to the manufacturer's instructions.

[0641] Transfections, infections, and titering

[0642] C227 cells, a dominant negative IRF-3 stably expressing cell line derived from human embryonic kidney (293T) cells, were transfected with Lipofectamine 2000 (Invitrogen) using 7 wildtype plasmids plus 1 mutant (library) plasmid. Supernatant was replaced with fresh cell growth medium at 24 hours and 48 hours post-transfection. At 72 hours post-transfection, supernatant containing infectious virus was harvested, filtered through a 0.45 µm MCE filter, and stored at -80°C. The TCID50 was measured on A549 cells (human lung carcinoma cells).

[0643] Virus from the C227 transfection was used to infect A549 cells at an MOI of 0.05. Infected cells were washed three times with PBS followed by the addition of fresh cell growth medium at 2 hours post-infection. Virus was harvested at 24 hours post-infection. For the mutant library profiling, all viral mutant libraries were passaged for two 24-hour rounds in A549 cells. Two rounds of passaging were sufficient for profiling.

[0644] Sequencing library preparation

[0645] DNA from the plasmid library or cDNA from the passaged viral mutant library was amplified with both forward and reverse primers each flanked with a 6 "N" tag and the flow cell adapter region. Flanking region for 5' primer: 5'-CTACACGACGCTCTTCCGATCTNNNNNN-3' (SEQ ID NO: 89). Flanking region for 3' primer: 5'-TGCTGAACCGCTCTTCCGATCTNNNNNN-3' (SEQ ID NO: 90). Following PCR, 93 amplicon products were pooled together. 15 million copies of the pooled product were used as the input for the second PCR, which was equivalent to 10 paired-end reads per molecule if 150 million paired-end reads (approximately one lane on an Illumina HiSeq 2000 machine) were sequenced. 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCG-3' (SEQ ID NO: 91) and 5'-CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCG-3' (SEQ ID NO: 92) were used as the primers for the second PCR. Products from the second PCR were submitted for NGS. The error-correction technique described in this study adapted the philosophy described for detecting rare mutations in human cells. Raw sequencing data have been submitted to the NIH Short Read Archive under accession numbers SRR1042008 (plasmid mutant library) and SRR1042006 (passaged mutant library).

[0646] Data Analysis

[0647] Sequencing reads were mapped by BWA with a maximum of six mismatches and no gap. Amplicons with the same tag were collected to generate a read cluster. Since each read cluster originated from the same template, true mutations were called only if the mutations occurred in 90% of the reads within a read cluster. Read clusters with a size below three reads were filtered out. Read clusters were further conflated into "error-free" reads. The relative fitness index (RF index) for individual point mutations was computed by:

[0648] (occurrence frequency in passaged library)/(occurrence frequency in plasmid library)

[0649] For all the downstream analysis, only point mutations covered with > 30 tag-conflated reads ("error-free" reads) in the plasmid library were included. This arbitrary cutoff filtered out mutants with low statistical confidence.
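The tag-based error correction described above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study. It assumes each mapped read has already been reduced to a tag and a set of mutation labels relative to the reference; the input format, function name, and example data are hypothetical. "Occurred in 90% of the reads" is interpreted as at least 90%.

```python
from collections import Counter, defaultdict

MIN_CLUSTER_SIZE = 3  # clusters with fewer than three reads are discarded


def conflate_read_clusters(reads):
    """Group reads by tag and keep mutations present in >= 90% of each cluster.

    reads: iterable of (tag, mutations) pairs, where mutations is a set of labels.
    Returns {tag: set of called mutations}, i.e. one "error-free" read per tag.
    """
    clusters = defaultdict(list)
    for tag, mutations in reads:
        clusters[tag].append(set(mutations))

    error_free = {}
    for tag, members in clusters.items():
        size = len(members)
        if size < MIN_CLUSTER_SIZE:
            continue
        counts = Counter(m for mutations in members for m in mutations)
        # Integer comparison avoids floating-point issues: count/size >= 9/10
        error_free[tag] = {m for m, c in counts.items() if 10 * c >= 9 * size}
    return error_free


# Hypothetical input: one 10-read cluster with a true mutation in 9 reads and a
# sporadic error in 1 read, plus a 2-read cluster that is filtered out.
reads = [("TAG_ONE", {"A100G"})] * 9 + [("TAG_ONE", {"C5T"})]
reads += [("TAG_TWO", {"G7A"})] * 2
print(conflate_read_clusters(reads))
```

The sporadic change is dropped because it appears in only one of ten reads sharing the tag, while the mutation present in nine of ten reads is retained.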

[0650] Discussion

[0651] Sequence conservation can be taken as the sole parameter for identifying residues essential for viral replication. In some cases, conservation is not equivalent to essentialness for viral replication. A fraction of the residues that are conserved in the influenza A virus are dispensable in viral replication. In addition, new mutations can be observed in every flu season. Residues that are currently conserved in nature can still mutate under future unforeseen selection pressures. High-throughput fitness profiling can complement a shortcoming in the sequence conservation analysis and can allow identification of amino acid residues that play a role in viral replication in a defined cellular environment.

[0652] Described herein is a method to profile the entire influenza A virus genome at single-nucleotide resolution. The fitness effects of individual point mutations were interrogated in a high-throughput manner by coupling a large mutant library with NGS. Similar experiments can be performed with strains across subtypes to identify mutations that display a genetic background-dependent fitness effect. These results can provide information to dissect the evolutionary process of the influenza A virus. This platform can be applied to study the virus-host interaction under different cellular responses (such as apoptosis, autophagy, inflammasome induction, ER stress, etc.) and immune responses (such as NK cells, T cells, antibodies, macrophages, cytokines, etc.) that in turn influence viral replication. Such results can improve understanding of the biological role of each residue on the genome of the influenza A virus. They can improve the design of a live attenuated influenza vaccine, e.g., by minimizing the virulence. The platform can be adapted to other viruses and microbes that can be genetically manipulated in the laboratory.

[0653] EXAMPLE IX: VACCINE FORMULATION COMPRISING A PLURALITY OF EPITOPES

[0654] Vaccine formulation comprising three peptides

[0655] An influenza A virus vaccine is formulated to include peptides RGINDRNFW (SEQ ID NO: 26), RWINDRNFW (SEQ ID NO: 27), and WHSNLNDATYQRTRALV (SEQ ID NO: 33). Additional components in the vaccine include a CD4+ (helper) T cell epitope and a Pam2Cys moiety. Each peptide is bridged to a CD4+ T cell epitope by a lysine residue and the Pam2Cys moiety is connected to the lysine residue through two serines. The vaccine is further formulated to comprise an additional adjuvant, AlOOH alum. A liposome comprising cholesterol, egg-yolk phosphatidylcholine (PC), and dicetylphosphate (DCP) (at 5/4/1 molar ratio) is formulated as a carrier for delivery of the peptide vaccine.

[0656] Vaccine formulation comprising 53 peptides

[0657] An influenza A virus vaccine is formulated to include peptides with sequences SEQ ID NOs: 1-53. Additional components in the vaccine include a CD4+ (helper) T cell epitope and a Pam2Cys moiety. Each peptide is bridged to a CD4+ (helper) T cell epitope by a lysine residue and the Pam2Cys moiety is connected to the lysine residue through two serines. The vaccine is further formulated to comprise an additional adjuvant, AlOOH alum. A liposome comprising cholesterol, egg-yolk phosphatidylcholine (PC), and dicetylphosphate (DCP) (at 5/4/1 molar ratio) is formulated as a carrier for delivery of the peptide vaccine.

[0658] EXAMPLE X: IMMUNIZATION IN MICE

[0659] Mice (6-8 week old BALB/c) are maintained in specific pathogen-free conditions and are immunized once by one of three routes: by the intranasal (i.n.) route with 50 µl of vaccine dispensed onto the nares of the mice under light penthrane anaesthesia, resulting in deposition in both the upper and lower respiratory tract; by the sub-cutaneous (s.c.) route with 100 µl of vaccine administered at the base of the tail; or by the intramuscular (i.m.) route with 25 µl of vaccine administered to each thigh.

[0660] One month post vaccination, mice are inoculated i.n. with 10^4.5 plaque-forming units (pfu) of infectious virus under light penthrane anaesthesia. On day five post-infection, mice are killed by cervical dislocation and lungs are collected in serum-free RPMI 1640 (Gibco, Gaithersburg, MD, USA) supplemented with antibiotics.

[0661] The virus includes Mem-Bel (H3N1) virus, a genetic reassortant bearing the NA of A/Bellamy/42 (H1N1) virus and the remaining genes of the A/Memphis/1/71 (H3N2) virus, and 2009 H1N1 virus strain A/Swine/Iowa/30 (Sw/30). All experiments involving the 2009 H1N1 virus (pandemic) are conducted under Biosafety Level 3 (BSL-3) conditions for animal work and Biosafety Level 2 with BSL-3 practices laboratory conditions for in vitro work, in accordance with guidelines of the Centers for Disease Control and Prevention.

[0662] Assay of virus-specific antibody production

[0663] Mice are bled via the tail vein on the day prior to challenge and serum samples are collected to examine virus-specific antibody by enzyme-linked immunosorbent assay (ELISA) or hemagglutination inhibition (HI). The ELISA assay utilizes polyvinyl flat-bottom microtitre plates (Dynex Technologies Inc., VA) coated with 5 µg/ml NP and/or HA equivalents of virus. The absorbance of the solutions is then determined using a Labsystems Multiscan Multisoft microplate reader (Labsystems, Helsinki, Finland). The optical density is calculated as the absorbance reading at (405 nm-450 nm). Antibody titres are expressed as the reciprocal of the serum dilution giving an optical density of 0.2, which represents 5 times the background level. The HI assay is performed in a microtitre format with 1% chicken red blood cells by standard methods.
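The titre definition above (the reciprocal of the serum dilution giving an optical density of 0.2) can be illustrated with a short Python sketch. The source does not state how the dilution at the cutoff is determined between measured points; the log-linear interpolation used here, the function name, and the example readings are assumptions made for illustration only.

```python
import math


def endpoint_titre(reciprocal_dilutions, optical_densities, cutoff=0.2):
    """Estimate the reciprocal dilution at which the OD crosses the cutoff.

    reciprocal_dilutions: increasing values, e.g. [100, 1000, 10000].
    optical_densities: OD readings at those dilutions.
    Interpolates linearly in OD against log10(dilution) between the two
    measurements that bracket the cutoff; returns None if it is never crossed.
    """
    points = list(zip(reciprocal_dilutions, optical_densities))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(points, points[1:]):
        if od_lo >= cutoff >= od_hi and od_lo != od_hi:
            fraction = (od_lo - cutoff) / (od_lo - od_hi)
            log_titre = math.log10(d_lo) + fraction * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_titre
    return None


# Hypothetical readings: OD falls from 0.3 to 0.1 between 1:100 and 1:1000.
print(endpoint_titre([100, 1000], [0.3, 0.1]))
# No crossing of the cutoff within the tested dilutions.
print(endpoint_titre([100, 1000], [0.5, 0.4]))
```

With the hypothetical readings above, the cutoff falls halfway between the two dilutions on a log scale, giving a titre of about 316.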

[0664] EXAMPLE XI: CLINICAL STUDY

[0665] The study will enroll up to 400 healthy adults ages 18 and older. Two hundred individuals will be 18-64 years old, and the other 200 will be greater than or equal to 65 years of age. Participants will be randomly assigned to 1 of 2 possible vaccine groups: group 1 will receive 15 mcg of an influenza A virus vaccine; group 2 will receive 30 mcg of an influenza A virus vaccine. Both groups will receive vaccine injections on days 0 and 21 in the arm muscle. Study procedures include: medical history, physical exam, maintaining a memory aid, and blood sample collection. Participants will be involved in study related procedures for approximately 7 months.

[0666] Primary outcome measures include number of participants reporting vaccine-associated serious adverse events, solicited local reactions based on the functional grading scale after the first and second vaccinations, measured injection site reactions of swelling and redness after the first and second vaccinations, solicited systemic reactions based on the functional grading scale after the first and second vaccinations, fever after the first and second vaccinations, and antibody titer increases against an influenza A virus at 8-10 days, or 21 days following 1 dose of vaccine. Secondary outcome measures include antibody titer increases against an influenza A virus at 8-10 days, or 21 days following 2 doses of vaccine.

[0667] EXAMPLE XII: ADMINISTRATION ROUTE OPTIMIZATION EXPERIMENT

[0668] Three administration routes (intraperitoneal, intramuscular, and intranasal) will be tested in this experiment. Each route or group will have twelve female C57BL/6 mice (6-12 weeks old). Within each group, the following conditions will be tested: no adjuvant, rod, alum, and Pam2Cys. Three mice will be tested with each of the conditions. In total, 36 mice will be used in this experiment.

[0669] Each mouse will receive either a single immunization containing 40mg total of peptide (e.g., one or more of the peptides described herein) or consecutive immunizations (e.g., as prime-boost). For single immunization, the mice will be immunized and then, after 4 weeks, will be sacrificed. For the prime-boost method, the mice will be immunized, followed by a boost two weeks later. After a total of 4 weeks, the mice will be sacrificed.

[0670] At the time of euthanization, spleens and lungs will be harvested. Harvested lungs will be homogenized. CD8+ T cells will be isolated from harvested spleens and/or lungs utilizing an affinity purification method or a size exclusion method. The isolated CD8+ T cells will further be stimulated ex vivo by cognate or irrelevant peptides overnight. Standard IFN-a ELISPOT will be used to detect an immune response to the tested peptides (e.g., one or more of the peptides described herein). Samples from each mouse will be tested about four times with a 1:2 dilution.
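The group arithmetic in Example XII (three routes, four adjuvant conditions per route, three mice per condition) can be checked with a short sketch. The data layout, function name, and mouse numbering are hypothetical and serve only to enumerate the design.

```python
from itertools import product

ROUTES = ["intraperitoneal", "intramuscular", "intranasal"]
ADJUVANTS = ["no adjuvant", "rod", "alum", "Pam2Cys"]
MICE_PER_CONDITION = 3

def build_allocation():
    """Enumerate one record per mouse for every route/adjuvant pair."""
    allocation = []
    mouse_id = 1  # hypothetical running identifier
    for route, adjuvant in product(ROUTES, ADJUVANTS):
        for _ in range(MICE_PER_CONDITION):
            allocation.append(
                {"mouse": mouse_id, "route": route, "adjuvant": adjuvant}
            )
            mouse_id += 1
    return allocation

allocation = build_allocation()
print(len(allocation))  # total number of mice in the design
```

Each route receives twelve mice and the full design uses 36, matching the totals stated in the example.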

[0671] EXAMPLE XIII: ADJUVANT COMPARISON EXPERIMENT

[0672] Female C57BL/6 mice (6-12 weeks old) will be used in this experiment. A total of seven conditions will be tested: no adjuvant, alum, rod, rod + monophosphoryl lipid A (MPL) + R837 (imiquimod), MPL + R837, alum + MPL + R837, or Pam2Cys. Each condition will have six mice. The administration route will be based on results obtained from Example XII, and can include intraperitoneal, intramuscular, or intranasal route.

[0673] Each mouse will receive either a single immunization containing 40mg total of peptide (e.g., one or more of the peptides described herein) or consecutive immunizations (e.g., as prime-boost). For single immunization, the mice will be immunized and then, after 4 weeks, will be sacrificed. For the prime-boost method, the mice will be immunized, followed by a boost two weeks later. After a total of 4 weeks, the mice will be sacrificed.

[0674] At the time of euthanization, spleens and lungs will be harvested. Harvested lungs will be homogenized. CD8+ T cells will be isolated from harvested spleens and/or lungs utilizing an affinity purification method or a size exclusion method. The isolated CD8+ T cells will further be stimulated ex vivo by cognate or irrelevant peptides overnight. Standard IFN-a ELISPOT will be used to detect an immune response to the tested peptides (e.g., one or more of the peptides described herein). Samples from each mouse will be tested about four times with a 1:2 dilution.

[0675] When values are provided, it can be understood that each value can be expressed as "about" a particular value or range. "About" can also include an exact amount. For example, "about 5 µ " includes "5 µ ". Generally, the term "about" can include an amount that would be expected to be within 10% of a recited value.

[0676] As used herein, the terms "protein", "polypeptide", and "peptide" are used interchangeably to refer to two or more natural and/or unnatural amino acids linked together, and one-letter amino acid designations are used in the sequences.

[0677] The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.

[0678] Unless defined otherwise, all technical and scientific terms used herein can have the same meaning as is commonly understood by one of skill in the art to which the claimed subject matter belongs. The descriptions herein are exemplary and explanatory only and are not restrictive of any subject matter claimed. In this application, the use of the singular can include the plural unless specifically stated otherwise. As used in the specification and the appended claims, the singular forms "a", "an", and "the" can include plural referents unless the context clearly dictates otherwise. In this application, the use of "or" can mean "and/or" unless stated otherwise. As used herein, "and/or" means "and" or "or". For example, "A and/or B" means "A, B, or both A and B" and "A, B, and/or C" means "A, B, C, or a combination thereof" and said "combination thereof" means "A and B, A and C, or B and C". Furthermore, use of the term "including" as well as other forms, such as "include", "includes", and "included", is not limiting.

[0679] To the extent necessary to understand or complete the disclosure of the present invention, all publications, patents, and patent applications mentioned herein are expressly incorporated by reference therein to the same extent as though each were individually so incorporated.

[0680] Having thus described exemplary embodiments of the present invention, it should be noted by those skilled in the art that the within disclosures are exemplary only and that various other alternatives, adaptations, and modifications may be made within the scope of the present invention. Accordingly, the present invention is not limited to the specific embodiments as illustrated herein, but is only limited by the following claims.

WHAT IS CLAIMED IS:

1. A peptide comprising, consisting essentially of, or consisting of an amino acid sequence selected from the group consisting of

a) a sequence selected from the group consisting of SEQ ID NOs: 1-53;

b) a sequence selected from the group consisting of SEQ ID NOs: 1-27, 29-38, and 40-53 and is at most 50 amino acids in length;

c) a sequence having at least 70% sequence identity to at least 15 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 47-53 and is at most 50 amino acids in length;

d) a sequence having at least 70% sequence identity to at least 8 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NOs: 1-27, 32, and 40-46 and is at most 50 amino acids in length;

e) a sequence having at least 70% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-27, 32-38, and 40-53 and is at most 50 amino acids in length;

f) a sequence having at least 70% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-2, 5-15, 18-22, 26-32, and 39-46 and is less than 15 amino acids in length;

g) a sequence having at least 90% sequence identity to SEQ ID NO: 29 or 31 and is at most 50 amino acids in length; and

h) a sequence having at least 70% sequence identity to SEQ ID NO: 28 and is 8 amino acids in length.
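As an illustration of how a percent-identity condition over a contiguous stretch, such as the one recited in item d) of claim 1, might be evaluated, the following is a minimal sketch. It assumes an ungapped, position-by-position comparison against every same-length window of a reference sequence; the claims do not specify an alignment method, and the function names and the reference sequence used here are hypothetical (it is not one of SEQ ID NOs: 1-53).

```python
def percent_identity(a, b):
    """Ungapped identity between two equal-length sequences, in percent."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def best_window_identity(candidate, reference, min_window=8):
    """Best identity of `candidate` against any contiguous window of
    `reference` that has the same length as `candidate`.

    Returns None when the candidate is shorter than `min_window`
    or longer than the reference.
    """
    n = len(candidate)
    if n < min_window or n > len(reference):
        return None
    return max(
        percent_identity(candidate, reference[i:i + n])
        for i in range(len(reference) - n + 1)
    )

def meets_threshold(candidate, reference, threshold=70.0,
                    min_window=8, max_length=50):
    """Apply the length cap, then the identity threshold."""
    if len(candidate) > max_length:
        return False
    best = best_window_identity(candidate, reference, min_window)
    return best is not None and best >= threshold

# Hypothetical reference sequence, for illustration only.
REFERENCE = "ACDEFGHIKLMNPQ"
```

With this reference, the candidate `DEFGAAKL` matches six of eight positions of the window `DEFGHIKL` (75%) and passes a 70% threshold, while `DEFAAAKL` matches five of eight (62.5%) and does not.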

2. The peptide of claim 1, wherein the peptide is attached to a lipid.

3. The peptide of claim 2, wherein the lipid comprises a palmitoyl group.

4. The peptide of any one of the preceding claims, wherein the peptide is attached to a CD4+ (helper) T cell epitope.

5. The peptide of any one of the preceding claims, wherein the peptide is an isolated peptide.

6. A composition comprising, consisting essentially of, or consisting of one or more peptides according to any one of claims 1-5.

7. The composition of claim 6, and further comprising a second peptide comprising, consisting essentially of, or consisting of an amino acid sequence that is not more than 50 amino acids in length and has

a) at least 70% sequence identity to at least 8 contiguous amino acids residues of a sequence selected from the group consisting of SEQ ID NOs: 1-53;

b) 100% sequence identity to at least 8 contiguous amino acids residues of a sequence selected from the group consisting of SEQ ID NOs: 1-53;

c) at least 70% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-53;

d) 100% sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53; or

e) a sequence selected from the group consisting of SEQ ID NOs: 1-53.

8. The composition of claim 7, wherein the second peptide is attached to a lipid.

9. The composition of claim 8, wherein the lipid comprises a palmitoyl group.

10. The composition of claim 8 or claim 9, wherein the second peptide is attached to a CD4+ (helper) T cell epitope.

11. A nucleic acid molecule which encodes a peptide according to claim 1.

12. The nucleic acid molecule of claim 11, wherein the nucleic acid molecule is isolated.

13. A composition comprising one or more nucleic acid molecules according to claim 11 or claim 12.

15. A protein that binds to a peptide comprising, consisting essentially of, or consisting of

an amino acid sequence having at least 70% sequence identity to at least 8 contiguous amino acids residues of a sequence selected from the group consisting of SEQ ID NOs: 1-53;

an amino acid sequence having at least 70% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-53; or

an amino acid sequence selected from the group consisting of SEQ ID NOs: 1-53.

16. The protein of claim 15, wherein the protein is an antibody or an antibody fragment.

17. A composition comprising one or more proteins according to claim 16 or claim 17.

18. The composition of any one of claims 6-10, 13, or 17, and further comprising a pharmaceutically acceptable carrier and/or an adjuvant.

19. The composition of any one of claims 6-10, 13, 17, or 18, wherein the composition is formulated for subcutaneous, intramuscular, intranasal, or intradermal administration.

20. A method comprising administering to a subject the composition of any one of claims 6-10, 13, 17, or 18.

21. The method of claim 20, wherein the subject is a human.

22. The method of claim 28 or claim 29, wherein the subject is infected with a virus.

23. The method of claim 22, wherein the virus is an influenza virus.

24. The method of claim 23, wherein the influenza virus is influenza A virus, influenza B virus, or influenza C virus.

25. The method of claim 24, wherein the influenza virus is an influenza A virus.

26. The method of any one of claims 20-25, wherein the composition when administered induces cross-protection against one or more subtypes of influenza strains in the subject.

27. The method of claim 26, wherein the composition when administered induces cross-protection against one or more subtypes of influenza A strains in the subject.

28. The method of any one of claims 20-27, wherein the administration is subcutaneous, intramuscular, intranasal, or intradermal.

29. The method of any one of claims 20-28, wherein the method is a vaccination method.

30. A method for selecting a peptide for use in a vaccine against a pathogen, which comprises:

a) obtaining a nucleic acid library comprising a plurality of nucleic acid molecules having one or more mutations relative to a naturally occurring nucleic acid sequence of the pathogen;

b) producing a set of pathogens using the nucleic acid library;

c) comparing the nucleic acid sequences of the nucleic acid library to the nucleic acid sequences of the set of pathogens;

d) selecting a peptide encoded by a nucleic acid sequence found in both the set of pathogens and the nucleic acid library for use in developing the vaccine.

31. The method of claim 30, wherein step c) further comprises comparing the occurrence of each nucleic acid molecule having the one or more mutations in the nucleic acid library to the occurrence in the set of pathogens.

32. The method of claim 30 or claim 31, wherein step d) further comprises selecting the nucleic acid sequence which has a desired relative fitness.

33. The method of claim 32, wherein the desired relative fitness is based on the ability of the pathogen to propagate.

34. The method of any one of claims 30-33, which further comprises performing next generation sequencing of nucleic acids in the nucleic acid library and the set of viruses.

35. The method of claim 34, wherein the next generation sequencing comprises use of reversible dye terminator nucleotides.

36. The method of any one of claims 30-35, wherein step d) further comprises an HLA affinity binding analysis using a computer.

37. The method of any one of claims 30-36, wherein step d) further comprises a sequence conservation analysis using a computer.

38. The method of any one of claims 30-37, wherein the pathogen is a virus.

39. The method of claim 38, wherein the virus is an influenza virus.
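The occurrence comparison recited in claims 31 to 33 can be illustrated with a minimal sketch. It assumes that "relative fitness" is taken as the change in a variant's frequency between the nucleic acid library and the resulting set of pathogens, normalized to a reference variant; this formula, the function name, the variant labels, and the read counts are all assumptions made for illustration and are not taken from the claims.

```python
def relative_fitness(library_counts, pathogen_counts, reference="WT"):
    """Frequency change of each variant from library to pathogen set,
    normalized so that the reference variant scores 1.0.

    Both arguments map a variant label to a sequencing read count.
    Variants with zero reads in the library are skipped because their
    input frequency is undefined.
    """
    lib_total = sum(library_counts.values())
    path_total = sum(pathogen_counts.values())

    def enrichment(variant):
        lib_freq = library_counts[variant] / lib_total
        path_freq = pathogen_counts.get(variant, 0) / path_total
        return path_freq / lib_freq

    if library_counts.get(reference, 0) == 0:
        raise ValueError("reference variant is absent from the library")
    ref_enrichment = enrichment(reference)
    if ref_enrichment == 0:
        raise ValueError("reference variant is absent from the pathogen set")

    return {
        variant: enrichment(variant) / ref_enrichment
        for variant in library_counts
        if library_counts[variant] > 0
    }

# Hypothetical read counts, for illustration only.
library = {"WT": 1000, "mutA": 100, "mutB": 100}
pathogens = {"WT": 2000, "mutA": 200, "mutB": 2}
fitness = relative_fitness(library, pathogens)
```

In this toy data set, `mutA` keeps pace with the reference, while `mutB` is strongly depleted after propagation, which under the stated assumption would indicate a position that tolerates mutation poorly.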

INTERNATIONAL SEARCH REPORT International application No. PCT/US 15/24563

A. CLASSIFICATION OF SUBJECT MATTER
IPC(8) - A61K 39/12, A61K 39/385, A61K 39/145, A61K 38/00 (2015.01)
CPC - C07K 14/005, A61K 39/145, A61K 38/00
According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED
Minimum documentation searched (classification system followed by classification symbols)
IPC(8) - A61K 39/12, A61K 39/385, A61K 39/145, A61K 38/00 (2015.01)
CPC - C07K 14/005, A61K 39/145, A61K 38/00

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched
USPC - 424/186.1, 424/196.11, 424/206.1, 530/326 (keyword search, terms below)

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)
PubWEST (USPT, PGPB, EPAB, JPAB), Google Patents/Scholar
Search Terms Used: $AYMLERELVRK$, palmitoyl, lipid, CD4, T cells, antibody, nucleic acid, epitope

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category* Citation of document, with indication, where appropriate, of the relevant passages Relevant to claim No.

US 2009/0191233 A1 (Bonnet et al.) 30 July 2009 (30.07.2009) para [0043], [0066], [0068]-[0071], SEQ ID NO: 4. Relevant to claim No.: 1, 4/1, 11-13; 2, 3, 4/(2-3), 15-17

US 2007/0066534 A1 (Jackson et al.) 22 March 2007 (22.03.2007) para [0001], [0037], [0039], [0051], [0061], [0062], [0096], Fig. 1, claims 9, 23, 26, 65. Relevant to claim No.: 2, 3, 4/(2-3), 15-17

□ Further documents are listed in the continuation of Box C.

* Special categories of cited documents:
"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier application or patent but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family

Date of the actual completion of the international search: 11 August 2015 (11.08.2015)
Date of mailing of the international search report: 20 AUG 2015
Name and mailing address of the ISA/US: Mail Stop PCT, Attn: ISA/US, Commissioner for Patents, P.O. Box 1450, Alexandria, Virginia 22313-1450
Facsimile No. 571-273-8300
Authorized officer: Lee W. Young

Form PCT/ISA/210 (second sheet) (January 2015)

Box No. II Observations where certain claims were found unsearchable (Continuation of item 2 of first sheet)

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

Claims Nos.: because they relate to subject matter not required to be searched by this Authority, namely:

Claims Nos.: because they relate to parts of the international application that do not comply with the prescribed requirements to such extent that no meaningful international search can be carried out, specifically:

Claims Nos.: 5-10, 18-29 and 34-39 because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box No. III Observations where unity of invention is lacking (Continuation of item 3 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows: This application contains the following inventions or groups of inventions which are not so linked as to form a single general inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional examination fees must be paid.

Group I, claims 1-4, 11-13 and 15-17, directed to an antigenic peptide or a protein that binds thereto.
Group II, claims 30-33, directed to a method for selecting a peptide for use in a vaccine against a pathogen.

The inventions listed as Groups I and II do not relate to a single special technical feature under PCT Rule 13.1 because, under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons:

- Please see extra sheet for continuation -

1. As all required additional search fees were timely paid by the applicant, this international search report covers all searchable claims.

2. As all searchable claims could be searched without effort justifying additional fees, this Authority did not invite payment of additional fees.

3. As only some of the required additional search fees were timely paid by the applicant, this international search report covers only those claims for which fees were paid, specifically claims Nos.:

4. No required additional search fees were timely paid by the applicant. Consequently, this international search report is restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 1-4, 11-13 and 15-17

The additional search fees were accompanied by the applicant's protest and, where applicable, the payment of a protest fee.
The additional search fees were accompanied by the applicant's protest but the applicable protest fee was not paid within the time limit specified in the invitation.
No protest accompanied the payment of additional search fees.

Form PCT/ISA/210 (continuation of first sheet (2)) (January 2015)

Continuation of:

Box No. III. Observations where unity of invention is lacking

Group I has the special technical feature of an antigenic peptide comprising, consisting essentially of, or consisting of an amino acid sequence selected from the group consisting of a sequence selected from the group consisting of SEQ ID NOs: 1-53, or a protein that binds to a peptide comprising, consisting essentially of, or consisting of an amino acid sequence having at least 70% sequence identity to at least 8 contiguous amino acids residues of a sequence selected from the group consisting of SEQ ID NOs: 1-53, that is not required by Group II.

Group II has the special technical feature of a method for selecting a peptide for use in a vaccine against a pathogen, that is not required by Group I.

Groups I and II share the common technical feature of a peptide derived from a pathogen for use in a vaccine against a pathogen. However, this shared technical feature does not represent a contribution over prior art, because this shared technical feature is anticipated by US 2009/0191233 A1 to Bonnet et al. (hereinafter Bonnett).

Bonnett teaches a peptide derived from a pathogen for use in a vaccine against a pathogen (para [0042], [0045]).

As the technical features were known in the art at the time of the invention, they cannot be considered special technical features that would otherwise unify the groups.

Therefore, Groups I-II lack unity of invention under PCT Rule 13.

NOTE: continuation of No 4 above: Claims 5-10, 18-29 and 34-39 are found to be unsearchable because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

NOTE: Claim 22 improperly states dependency from the method of claim 28 or claim 29, because claim 28 and claim 29 lack proper antecedent "the subject" that is required in claim 22. For the purposes of the present invitation, claim 22 is interpreted to be dependent from claim 21.

NOTE: claim 14 is missing.

Form PCT/ISA/210 (extra sheet) (January 2015)