Human B-cell Receptor Profiling Service at MedGenome Inc.

MedGenome Inc. offers a sequencing service to analyze B-cell receptor (BCR) repertoires from human or mouse samples. The service uses the Takara SMARTer BCR Profiling Kit, which combines SMART (Switching Mechanism at 5' End of RNA Template) technology with a 5' RACE approach for unbiased and clonal amplification of BCR repertoire sequences. Repertoires are sequenced on the Illumina MiSeq platform and analyzed using Takara Bio Immune Profiler software for sensitive and accurate BCR profiling. MedGenome Inc. also supports the design of new diagnostic experiments and advanced bioinformatics analysis.

Table 1: BCR - Sequencing Services Offered at MedGenome Inc.

Technology: Bulk SMARTer IgG/M BCR Profiling Kit (Human/mouse) (Takara Bio USA)
Source & Input Type: Isolated Cells / RNA or PBMCs
Required Amount: 10 ng - 3 μg RNA / 50-10,000 cells
Sequencing Method: Illumina MiSeq PE300
Analysis Method: Takara Bio Immune Profiler
Information Obtained: CDR3, V(D)J sequences

[Figure 1 panels: Cell isolation (B cell markers CD10, CD19, CD24, CD27, CD38); Template purification (heavy-chain VDJ-CH mRNA from Chr14, light-chain VJ-CL mRNA from Chr2); Molecule amplification (5' RACE); Library construction and sequencing; Data processing (filtering, alignment and clustering); Data analysis (repertoire diversity, dynamics and clonality; CDR-H3; V1 D1 J1 ... Vn Dn Jn)]

Figure 1: Schematics of steps for high-throughput BCR Repertoire Sequencing

MEDGENOME INC. medgenome.com

What is B-Cell Receptor (BCR) Repertoire

The adaptive immune system is fundamentally dependent upon the generation of diverse repertoires of B-lymphocyte antigen receptors (BCRs). BCRs are membrane-bound cell surface receptors that are assembled by genetic rearrangement and somatic recombination of a large pool of immunoglobulin gene segments, and they are constantly diversified to specifically bind exogenous antigens and support endogenous host responses.

V(D)J recombination is the tandem arrangement of variable (V), diversity (D) and joining (J) gene segments, which often results in nucleotide insertion and/or deletion at the junctions between gene segments (Figure 2). In the human BCR, the variable region is recombined from three gene segments at the immunoglobulin heavy chain locus (V, D, J) and two gene segments at the immunoglobulin light chain locus (V, J).

The collection of B-cell receptors genetically rearranged for different antigen specificities is known as the BCR repertoire.

[Figure 2 panels: Stem cells (V, D, J and C gene segments); Gene rearrangement; Naive B cells; Somatic hypermutation and affinity maturation; Affinity-matured B cells; Heavy chain and light chain; Transcription; mRNA (immunoglobulin); Translation]

Figure 2. Gene rearrangement in the B cell receptor Immunoglobulin Heavy Chain

Applications of BCR Repertoire Sequencing

[Figure 3 labels: B-cell Repertoire Analysis (center); Tracking known Repertoire Sequence; Diagnostic Marker Discovery; Disease Diagnosis and Vaccination]

Figure 3. BCR Repertoire Applications

Tracking known repertoire sequences

Incorporating repertoire sequencing into discovery projects will enable the identification of antibody sequences that are either novel or previously identified. Such knowledge can aid in the development of new therapeutic studies.

Diagnostic Marker Discovery against Infectious Diseases

High-throughput repertoire sequencing can provide broad information on disease-specific BCR clones and the dynamic changes in their clonality during an infectious state. Sequencing data from antigen-specific clones with stereotyped features in post-infection repertoires can provide greater insight and contribute to diagnostic marker discovery for infectious diseases such as H1N1, malaria and COVID-19.

Disease Diagnosis and Vaccination

Repertoire sequencing can provide insight into disease-associated antibody repertoires, such as measures of diversity and the relative abundance of different antibody clone sequences. This repertoire sequence data can significantly aid in understanding immunological mechanisms during vaccine development and help correlate repertoire sequence data with functional measures from immunological assays.
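The diversity and abundance measures mentioned above can be illustrated with a minimal Python sketch. It assumes clonotype read counts are already available as a plain list. The function names and the choice of Shannon entropy, clonality (1 minus normalized entropy) and the Simpson index are illustrative assumptions and are not taken from the MedGenome or Takara pipelines.

```python
import math

def clone_fractions(counts):
    """Convert clonotype counts into relative abundances (zero counts are dropped)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return [c / total for c in counts if c > 0]

def shannon_entropy(counts):
    """Shannon entropy (natural log) of the clonotype abundance distribution."""
    return -sum(p * math.log(p) for p in clone_fractions(counts))

def clonality(counts):
    """1 - normalized entropy: close to 0 for an even repertoire, close to 1 when one clone dominates."""
    fractions = clone_fractions(counts)
    if len(fractions) < 2:
        return 1.0  # a single clonotype is treated as fully clonal
    return 1.0 - shannon_entropy(counts) / math.log(len(fractions))

def simpson_index(counts):
    """Probability that two randomly drawn reads belong to the same clonotype."""
    return sum(p * p for p in clone_fractions(counts))

# Hypothetical count vectors, for illustration only
even_repertoire = [10, 10, 10, 10]
skewed_repertoire = [97, 1, 1, 1]

for name, counts in (("even", even_repertoire), ("skewed", skewed_repertoire)):
    print(name, round(clonality(counts), 3), round(simpson_index(counts), 3))
```

An even repertoire gives a clonality near 0, while a repertoire dominated by one expanded clone gives a value near 1. This is the kind of shift one would look for when comparing pre- and post-vaccination samples.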

Major Technologies

Gene Amplification Technology

Takara SMARTer Human BCR Profiling leverages SMART technology (Switching Mechanism at 5’ End of RNA Template) and pairs NGS with a 5’ RACE approach (Figure 4A). First-strand cDNA synthesis is dT-primed and carried out by the MMLV-derived SMARTScribe Reverse Transcriptase (RT), which adds non-templated nucleotides when it reaches the 5’ end of each mRNA template. The SMART UMI oligo anneals to these nucleotides, so that each full-length cDNA carries a unique molecular identifier (UMI). The first-strand cDNA is then subjected to two rounds of gene-specific, semi-nested PCR amplification. The semi-nested PCR reduces variability and allows priming from the constant region of the heavy or light chains (Figure 4B). This method yields sensitive and reproducible B-cell repertoire profiles and captures the complete V(D)J variable regions of BCR transcripts.

Figure 4. Takara SMARTer Human BCR Profiling kit workflow

Next Generation Sequencer (Illumina MiSeq)

Reading and assigning the V, D, J and C regions of BCR transcripts requires long-read sequencing (400-600 bp). For this purpose, the BCR repertoire is sequenced on the Illumina MiSeq (Figure 5).

Figure 5. Illumina MiSeq
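As a rough worked example of the read-length requirement, the sketch below computes how much two paired-end reads overlap for amplicons in the 400-600 bp range. The 300 bp read length is taken from the "PE300" entry in Table 1. The function names and the 20 bp minimum-overlap threshold are hypothetical, and the calculation ignores adapters, primers and quality trimming.

```python
def paired_end_overlap(amplicon_bp: int, read_len: int = 300) -> int:
    """Bases covered by both reads of a pair; a negative value means an uncovered gap."""
    return 2 * read_len - amplicon_bp

def reads_can_overlap(amplicon_bp: int, read_len: int = 300, min_overlap: int = 20) -> bool:
    """True when the two reads share at least `min_overlap` bases (hypothetical threshold)."""
    return paired_end_overlap(amplicon_bp, read_len) >= min_overlap

for size in (400, 500, 600):
    print(size, paired_end_overlap(size), reads_can_overlap(size))
```

Under these assumptions a 400 bp amplicon is covered twice over 200 bp, while a 600 bp amplicon is covered end to end with no overlap. This is why a 2 x 300 bp run is the natural fit for this amplicon size range.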

BCR Repertoire Analysis

MedGenome Inc. uses Takara Bio Immune Profiler software to analyze the V, D, J and C regions of the BCR transcripts. The software incorporates two third-party software packages, MIGEC and MiXCR, for accurate and reliable clonotype calling and quantification.
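The idea behind UMI-guided error correction can be shown with a toy Python sketch: reads sharing a UMI are grouped and collapsed into a per-position majority consensus. This is a simplified illustration under stated assumptions (equal-length reads, UMIs already extracted, a hypothetical `min_reads` threshold). It is not the actual MIGEC or Immune Profiler implementation.

```python
from collections import Counter, defaultdict

def group_by_umi(reads):
    """Group (umi, sequence) pairs into {umi: [sequences]}."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)

def consensus(seqs):
    """Per-position majority vote over equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("toy consensus expects equal-length sequences")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))

def umi_consensus(reads, min_reads=2):
    """Return one consensus sequence per UMI supported by at least `min_reads` reads."""
    return {
        umi: consensus(seqs)
        for umi, seqs in group_by_umi(reads).items()
        if len(seqs) >= min_reads
    }

# Hypothetical reads: the third read carries a PCR/sequencing error (G -> T)
reads = [
    ("AAAC", "TGTGCGAGA"),
    ("AAAC", "TGTGCGAGA"),
    ("AAAC", "TGTGCTAGA"),
    ("GGTT", "TGTACGAGA"),  # singleton UMI, dropped by min_reads=2
]
print(umi_consensus(reads))
```

Collapsing reads this way removes errors introduced after the UMI was attached and counts each original mRNA molecule once. This is what makes UMI-based clonotype quantification less sensitive to PCR amplification bias.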

A validation study at MedGenome Inc. was performed using total RNA from human spleen as well as B-cell expressing cell lines and PBMCs using the human Takara SMARTer BCR Profiling Kit. Sequencing data was analyzed using Takara Bio Immune Profiler software following recommended guidelines. We provide a representative BCR repertoire report using human spleen total RNA.

[Figure 6 panels A-D: TapeStation electropherograms; x-axis: Size (bp), 25-1500; y-axis: Sample Intensity (Normalized FU), 0-5000; peak labels between the lower and upper markers: 691 (A), 741 (B), 691 (C), 1000 (D)]

Figure 6. TapeStation traces show representative profiles of B-cell receptor repertoire libraries generated from 10 ng of total RNA from human spleen to specifically amplify the IgG heavy chain (A), IgM heavy chain (B), Kappa chain (C) and Lambda chain (D). The libraries were generated using the Takara SMARTer Human BCR IgG IgM H/K/L Profiling Kit, following the manufacturer’s instructions. Sequencing was performed using the Illumina MiSeq v3 600-cycle kit, and analysis was performed using the Immune Profiler pipeline provided by Takara.

Table 2. Mapping statistics showing the specificity of amplification of heavy and light chains in BCR libraries prepared from human spleen total RNA.

Sample Name | IGG | IGM | IGK | IGL | short | undetermined | flc | total
Human Spleen - 10ng (IgG) | 139813(97.0%) | 53(0.0%) | 93(0.1%) | 263(0.2%) | 0(0.0%) | 3953(2.7%) | 0(0.0%) | 144175(100.0%)
Human Spleen - 10ng (IgM) | 55(0.1%) | 51847(97.1%) | 45(0.1%) | 75(0.1%) | 0(0.0%) | 1393(2.6%) | 0(0.0%) | 53415(100.0%)
Human Spleen - 10ng (IgK) | 109(0.1%) | 44(0.0%) | 152162(97.1%) | 93(0.1%) | 0(0.0%) | 4257(2.7%) | 0(0.0%) | 156665(100.0%)
Human Spleen - 10ng (IgL) | 144(0.1%) | 80(0.1%) | 261(0.2%) | 101928(96.0%) | 0(0.0%) | 3783(3.6%) | 0(0.0%) | 106196(100.0%)
Human Spleen - 100ng (IgG) | 2235787(97.3%) | 238(0.0%) | 540(0.0%) | 1988(0.1%) | 0(0.0%) | 59298(2.6%) | 0(0.0%) | 2297851(100.0%)
Human Spleen - 100ng (IgM) | 274(0.0%) | 644372(97.6%) | 582(0.1%) | 291(0.0%) | 0(0.0%) | 14750(2.2%) | 0(0.0%) | 660269(100.0%)
Human Spleen - 100ng (IgK) | 453(0.0%) | 186(0.0%) | 1155708(97.4%) | 334(0.0%) | 0(0.0%) | 30286(2.6%) | 0(0.0%) | 1186967(100.0%)
Human Spleen - 100ng (IgL) | 674(0.0%) | 183(0.0%) | 293(0.0%) | 1546070(96.7%) | 0(0.0%) | 51057(3.2%) | 0(0.0%) | 1598277(100.0%)
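The on-target percentages in Table 2 can be recomputed from the raw read counts. The short Python sketch below copies the on-target and total read counts from the table and checks them against the reported percentages. The variable and function names are illustrative only.

```python
# (sample, on-target reads, total reads, reported on-target %) copied from Table 2
TABLE_2 = [
    ("Human Spleen - 10ng (IgG)", 139813, 144175, 97.0),
    ("Human Spleen - 10ng (IgM)", 51847, 53415, 97.1),
    ("Human Spleen - 10ng (IgK)", 152162, 156665, 97.1),
    ("Human Spleen - 10ng (IgL)", 101928, 106196, 96.0),
    ("Human Spleen - 100ng (IgG)", 2235787, 2297851, 97.3),
    ("Human Spleen - 100ng (IgM)", 644372, 660269, 97.6),
    ("Human Spleen - 100ng (IgK)", 1155708, 1186967, 97.4),
    ("Human Spleen - 100ng (IgL)", 1546070, 1598277, 96.7),
]

def on_target_percent(on_target: int, total: int) -> float:
    """Share of reads assigned to the chain the library was designed to amplify."""
    return 100.0 * on_target / total

for name, on_target, total, reported in TABLE_2:
    computed = round(on_target_percent(on_target, total), 1)
    print(f"{name}: computed {computed}%, reported {reported}%")
```

Each library assigns roughly 96-98% of its reads to the intended chain, and the recomputed values match the percentages reported in the table.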

Table 3. Representative table of final clonotype counts from BCR libraries generated using 10 ng of human spleen total RNA (Top 5 clonotypes), each amplifying the IgG heavy and light chains.

Sample ID | CloneId | Clone Count | Clone Fraction | All VHits With Score | All DHits With Score | All JHits With Score | All CHits With Score | AA Seq CDR3
S3034351_IGL_mig_cdr3 | 0 | 2 | 1 | IGLV2-14*00(2125), IGLV2-18*00(1874) | | IGLJ1*00(351) | IGLC1*00(84.5) | CSSYTTSSTYIF
S3034351_IGM_mig_cdr3 | 0 | 1 | 1 | IGHV3-7*00(1462) | IGHD3-3*00(40) | IGHJ4*00(401) | IGHM*00(266) | CARSFWRFDYW
S3034352_IGG_mig_cdr3 | 0 | 206 | 0.009528655 | IGHV3-48*00(3179.4) | IGHD3-16*00(25), IGHD5-12*00(25) | IGHJ4*00(372), IGHJ5*00(350), IGHJ1*00(331) | IGHG1*00(82.7), IGHG2*00(81.8), IGHGP*00(81) | CTRGLFENW
S3034352_IGG_mig_cdr3 | 1 | 113 | 0.005226884 | IGHV3-7*00(3466.7) | IGHD3-16*00(25), IGHD3-3*00(25) | IGHJ4*00(293.7), IGHJ5*00(283.7) | IGHG1*00(79.8), IGHG2*00(79), IGHGP*00(78.8) | CEGGGPKADHW
S3034352_IGG_mig_cdr3 | 2 | 68 | 0.003145381 | IGHV3-74*00(3681) | IGHD3-10*00(42), IGHD1-14*00(41) | IGHJ6*00(332.9) | IGHG1*00(88.3), IGHG2*00(86.9), IGHGP*00(85.1) | CARDPSGIGVGELRWGPEWNHLRNKKYGMDVW
S3034352_IGG_mig_cdr3 | 3 | 68 | 0.003145381 | IGHV1-18*00(2714.6) | IGHD6-6*00(56) | IGHJ6*00(347) | IGHG3*00(84.1), IGHG4*00(84.1) | CARDHIATRPQYNYGMDVW
S3034352_IGG_mig_cdr3 | 4 | 45 | 0.002081502 | IGHV3-7*00(3946.1) | IGHD3-10*00(30), IGHD3-16*00(30) | IGHJ4*00(413) | IGHG1*00(81.3), IGHG2*00(80.2), IGHGP*00(78.2) | CAGETYYYDHW
S3034352_IGG_mig_cdr3 | 5 | 41 | 0.00189648 | IGHV4-59*00(3740.3), IGHV4-61*00(3534.5) | IGHD3-22*00(130) | IGHJ5*00(500) | IGHG1*00(81.2), IGHGP*00(80.4), IGHG2*00(80.3) | CARGWYYYDSSGYSNWFDPW
S3034352_IGG_mig_cdr3 | 6 | 41 | 0.00189648 | IGHV3-53*00(3353.3), IGHV3-66*00(3150.3) | IGHD1-26*00(25), IGHD7-27*00(25) | IGHJ4*00(411) | IGHG1*00(83.5), IGHG2*00(82.8), IGHGP*00(81.3) | CTSAPGTFDYW
S3034352_IGG_mig_cdr3 | 7 | 40 | 0.001850224 | IGHV4-39*00(3279.9) | IGHD3-22*00(40) | IGHJ4*00(417.1), IGHJ5*00(347.1) | IGHG1*00(87.2), IGHG2*00(85.7), IGHGP*00(82.6) | CARHLRYDRCLDYW
S3034352_IGG_mig_cdr3 | 8 | 39 | 0.001803969 | IGHV3-30*00(3081.1), IGHV3-33*00(3053.6) | IGHD2-8*00(41) | IGHJ3*00(451) | IGHG1*00(86.3), IGHG2*00(85.6), IGHGP*00(83.3) | CARDGRSCTVPICHSFYAFDLW
S3034352_IGG_mig_cdr3 | 9 | 38 | 0.001757713 | IGHV3-9*00(3306.4) | IGHD1-7*00(55) | IGHJ4*00(401.2) | IGHG1*00(73.5), IGHG2*00(72.1), IGHGP*00(68.9) | CAKDSRAGTTGYFDHW
S3034352_IGG_mig_cdr3 | 10 | 37 | 0.001711458 | IGHV3-33*00(3550.8), IGHV3-30*00(3499.5) | IGHD3-22*00(74) | IGHJ6*00(335) | IGHG1*00(84.2), IGHG2*00(83.7), IGHGP*00(83.3) | CAKIPVTYYFDISGYSDPDYYSYFRLDVW
S3034352_IGG_mig_cdr3 | 11 | 37 | 0.001711458 | IGHV3-23*00(3736.4) | IGHD1-1*00(41), IGHD1-20*00(41), IGHD6-19*00(40) | IGHJ4*00(370.2), IGHJ5*00(349.2) | IGHG1*00(89.7), IGHG2*00(89.2), IGHGP*00(89.2) | CAKTARDWYDEYW
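The "Hits With Score" cells in Table 3 list candidate gene segments with their alignment scores in parentheses. The Python sketch below is one possible way to parse such a cell and pick the top-scoring gene, and it also back-calculates the total read count implied by a clone count and clone fraction. The regular expression and function names are assumptions made for illustration and are not part of MiXCR or the Immune Profiler software.

```python
import re

# Matches entries such as "IGHJ4*00(372)" or "IGHV3-7*00 (1462)"
HIT_PATTERN = re.compile(r"([A-Z0-9\-]+\*\d+)\s*\(([\d.]+)\)")

def parse_hits(cell: str):
    """Turn 'IGHJ4*00(372), IGHJ5*00(350)' into [('IGHJ4*00', 372.0), ('IGHJ5*00', 350.0)]."""
    return [(gene, float(score)) for gene, score in HIT_PATTERN.findall(cell)]

def best_hit(cell: str):
    """Return the highest-scoring gene in the cell, or None for an empty cell."""
    hits = parse_hits(cell)
    return max(hits, key=lambda hit: hit[1])[0] if hits else None

# J-gene cell of IGG clonotype 0 in Table 3
j_cell = "IGHJ4*00(372), IGHJ5*00(350), IGHJ1*00(331)"
print(parse_hits(j_cell))
print(best_hit(j_cell))

# Clone fraction = clone count / total count, so the total can be back-calculated
implied_total = round(206 / 0.009528655)
print(implied_total, 113 / implied_total)
```

For IGG clonotype 0 the best J hit is IGHJ4*00. The implied total of about 21,619 counts also reproduces the fraction reported for clonotype 1 (113 / 21619 ≈ 0.005226884), which suggests the fractions in the table are computed per sample and chain.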

[Figure 7: bar chart; x-axis: Human Spleen RNA Input (10 ng, 100 ng); y-axis: Clonotype counts (0-15000); legend (Antibody Chain): IGG, IGK, IGL, IGM]

Figure 7. Bar chart showing the total number of clonotypes identified in human spleen RNA per Ig chain at different RNA input amounts.

Advanced Deliverables

In addition to the standard deliverables using the MiXCR pipeline and MedGenome’s pipeline for clonotype comparisons, we also offer advanced bioinformatics and data visualization services using VDJtools by MiLabs.

[Figure 8 panels A-C; panel C axes: x-axis: CDR3 length, bp (30-60); y-axis: 0.000-0.100; legend (Clonotype): CARYSNYGWTFDNW, CARLIEGGANRFDYW, CARRIPLYGMDVW, CARDVMGRDDYW, CVERVGGTLGVW, CVRAGLWLPYLAEFDYW, CARRPLDHITIFDIVVTRRTWFDTW, CTTDHGSSGCLRW, CATGEVTKHYYYGMDVW, CARVLHYDVWSIYYYVLDVW, Other]

Figure 8. Sample BCR repertoire report with advanced deliverables including (A) dendrograms and clustering diagrams showing usage of the V(D)J genes across multiple samples, (B) chord diagram showing the usage and pairing of the BCR V(D)J genes, and (C) spectratype plot of the CDR3 usage for the samples.
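A spectratype such as the one in Figure 8C is essentially a histogram of CDR3 lengths weighted by clonotype abundance. The Python sketch below shows one way this could be computed from (CDR3 amino acid sequence, count) pairs, taking the nucleotide length as three times the amino acid length. The function name is hypothetical, and the example uses only four IGG clonotypes from Table 3, so the fractions are relative to that subset and not to the full repertoire.

```python
from collections import defaultdict

def spectratype(clones):
    """Map CDR3 length in nucleotides to the fraction of counts at that length.

    `clones` is an iterable of (cdr3_amino_acid_sequence, count) pairs.
    """
    counts_by_length = defaultdict(int)
    for cdr3_aa, count in clones:
        counts_by_length[3 * len(cdr3_aa)] += count
    total = sum(counts_by_length.values())
    if total == 0:
        raise ValueError("no clonotype counts supplied")
    return {length: n / total for length, n in sorted(counts_by_length.items())}

# Subset of IGG clonotypes from Table 3: (AA Seq CDR3, Clone Count)
igg_subset = [
    ("CTRGLFENW", 206),
    ("CEGGGPKADHW", 113),
    ("CAGETYYYDHW", 45),
    ("CTSAPGTFDYW", 41),
]

for length, fraction in spectratype(igg_subset).items():
    print(f"{length} nt: {fraction:.3f}")
```

Plotting these fractions against length gives the spectratype profile. A strongly peaked profile points to a few dominant CDR3 lengths, while a broad, bell-shaped profile is typical of a diverse repertoire.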

References

Georgiou, G., Ippolito, G., Beausang, J. et al. The promise and challenge of high-throughput sequencing of the antibody repertoire. Nat Biotechnol 32, 158–168 (2014).

Bolotin, D.A. et al. MiXCR: software for comprehensive adaptive immunity profiling. Nat. Methods 12, 380–381 (2015).

Yaari, G. and Kleinstein, S.H. Practical guidelines for B-cell receptor repertoire sequencing analysis. Genome Med. 7:121 (2015).

Shugay, M. et al. VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires. PLoS Comput Biol 11(11):e1004503 (2015).

Profiling mouse B-cell receptors with SMART technology, Takara Bio. Retrieved from https://www.takarabio.com/learning-centers/next-generation-sequencing/technical-notes/immune-profiling/bcr-repertoire-profiling-from-mouse-samples-(bulk).
