https://www.alphaknockout.com

Mouse Reg3a Knockout Project (CRISPR/Cas9)

Objective: To create a Reg3a knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Reg3a gene (NCBI Reference Sequence: NM_011259; Ensembl: ENSMUSG00000079516) is located on mouse chromosome 6. Six exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 6 (Transcript: ENSMUST00000101272). Exons 2~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.19% of the coding region. Exons 2~6 cover 100.0% of the coding region. The size of the effective KO region is ~2518 bp. The KO region does not contain any other known gene.
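The ~2518 bp figure can be cross-checked against the coordinates reported in the BLAT search results of this report. The sketch below assumes that the KO region corresponds to the interval between the 2000 bp upstream section (ending just before the start codon) and the 2000 bp downstream section (starting just after the stop codon); that reading is an assumption, and the helper name `span_between` is hypothetical.

```python
# Genomic coordinates (chr6, forward strand) copied from the BLAT self-hits
# of the two 2000 bp flanking sections listed in this report.
UPSTREAM_FLANK = (78379089, 78381088)    # 2000 bp upstream of the start codon
DOWNSTREAM_FLANK = (78383607, 78385606)  # 2000 bp downstream of the stop codon


def span_between(upstream, downstream):
    """Return (start, end, length) of the 1-based inclusive interval
    lying strictly between two flanking intervals."""
    start = upstream[1] + 1
    end = downstream[0] - 1
    return start, end, end - start + 1


start, end, length = span_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)
print(f"chr6:{start}-{end} ({length} bp)")
```

Under that assumption the interval length agrees with the stated KO region size.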


Overview of the Targeting Strategy

[Figure: wildtype allele (5' to 3') showing exons 1 to 6 of mouse Reg3a, the two gRNA regions, and the knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream section against itself. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
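As an illustration of the self-alignment idea, the following is a minimal sketch of an exact-match dot plot with a 15 bp window. It is not the tool used to produce the plots in this report: it covers the forward strand only, requires exact window matches, and the function names are hypothetical.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the windows
    seq[i:i+window] and seq[j:j+window] are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def repeat_offsets(matches):
    """Count matches per diagonal offset (j - i). Many matches at one
    small offset suggest a tandem repeat with that period."""
    counts = {}
    for i, j in matches:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Toy input: an 8 bp unit repeated five times (not sequence from this report).
toy = "ACGTTGCA" * 5
print(repeat_offsets(dot_plot_matches(toy)))
```

On real sequence, off-diagonal runs of matches mark the repeat-containing intervals that a gRNA site would be placed outside of.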

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream section against itself. Legend: Forward, Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream section.]

Summary: Full Length(2000bp) | A(31.3% 626) | C(17.15% 343) | T(31.05% 621) | G(20.5% 410)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream section.]

Summary: Full Length(2000bp) | A(34.05% 681) | C(19.05% 381) | T(27.85% 557) | G(19.05% 381)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
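The base counts in the two summaries give the overall GC content directly, and a sliding window shows how a local profile could be computed. This is a minimal sketch with hypothetical function names; the 300 bp window matches the report, while the `step` parameter and the toy input are assumptions for illustration.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage from base counts."""
    return 100.0 * (g + c) / (a + c + t + g)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window of `window` bp, moved by `step` bp."""
    seq = seq.upper()
    return [
        100.0 * sum(1 for base in seq[i:i + window] if base in "GC") / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the two summaries above.
upstream = gc_percent_from_counts(a=626, c=343, t=621, g=410)
downstream = gc_percent_from_counts(a=681, c=381, t=557, g=381)
print(f"upstream GC: {upstream:.2f}%  downstream GC: {downstream:.2f}%")

# Toy input (not sequence from this report): a GC-only block, then an AT-only block.
print(sliding_gc("GC" * 150 + "AT" * 150, window=300, step=300))
```

Both sections come out below 40% GC overall, which is consistent with the notes that no significant high-GC region is present.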


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START     END       SPAN
--------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr6   +       78379089  78381088  2000
YourSeq  79     424    508   2000   96.5%     chr14  +       92517709  92517793  85
YourSeq  31     773    811   2000   97.0%     chr4   -       38083535  38083718  184
YourSeq  26     776    811   2000   96.5%     chr8   +       42011031  42011068  38

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr6   +       78383607   78385606   2000
YourSeq  39     1227   1364  2000   95.5%     chr2   +       105119184  105119368  185
YourSeq  26     1598   1624  2000   100.0%    chrX   -       49895412   49895439   28
YourSeq  23     1546   1569  2000   100.0%    chr5   -       138342567  138342673  107
YourSeq  22     1385   1409  2000   96.0%     chr1   +       42063046   42063071   26
YourSeq  20     1139   1158  2000   100.0%    chr6   -       79912818   79912837   20
YourSeq  20     1721   1744  2000   91.7%     chrX   +       49386448   49386471   24

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
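One way to read the BLAT tables is to set aside the full-length self-hit and look at the best remaining score relative to the query size. The sketch below does this for the upstream rows; the parsing helper, the `BlatHit` type, and the choice of "score as a fraction of query size" as a summary are illustrative assumptions, not the criterion used in this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


# Rows copied from the upstream BLAT table (QUERY column omitted).
UPSTREAM_ROWS = """\
2000 1 2000 2000 100.0% chr6 + 78379089 78381088 2000
79 424 508 2000 96.5% chr14 + 92517709 92517793 85
31 773 811 2000 97.0% chr4 - 38083535 38083718 184
26 776 811 2000 96.5% chr8 + 42011031 42011068 38
"""


def parse_hits(text):
    """Parse whitespace-separated BLAT rows into BlatHit records."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 10:
            continue  # skip blank or malformed rows
        hits.append(BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                            float(f[4].rstrip("%")), f[5], f[6],
                            int(f[7]), int(f[8]), int(f[9])))
    return hits


def secondary_hits(hits):
    """Drop the full-length self-hit (score equal to the query size)."""
    return [h for h in hits if h.score < h.q_size]


best = max(secondary_hits(parse_hits(UPSTREAM_ROWS)), key=lambda h: h.score)
print(best.chrom, best.score, f"{100 * best.score / best.q_size:.2f}% of query size")
```

The strongest secondary hit covers only a small fraction of the 2000 bp query, in line with the note that no significant similarity is found.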


Gene and information: Reg3a regenerating islet-derived 3 alpha [ Mus musculus (house mouse) ] Gene ID: 19694, updated on 12-Aug-2019

Gene summary

Official Symbol: Reg3a (provided by MGI)
Official Full Name: regenerating islet-derived 3 alpha (provided by MGI)
Primary source: MGI:MGI:109408
See related: Ensembl:ENSMUSG00000079516
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AV063448
Expression: Biased expression in large intestine adult (RPKM 62.2), duodenum adult (RPKM 50.2) and 1 other tissue

Genomic context

Location: 6 C3; 6 34.76 cM

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  6    NC_000072.6 (78380709..78383839)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    6    NC_000072.5 (78330703..78333833)

[Figure: genomic context on Chromosome 6 - NC_000072.6.]


Transcript information: This gene has 1 transcript

Gene: Reg3a ENSMUSG00000079516

Description: regenerating islet-derived 3 alpha [Source:MGI Symbol;Acc:MGI:109408]
Gene Synonyms: RegIII (alpha)
Location: Chromosome 6: 78,380,709-78,383,827 forward strand. GRCm38:CM000999.2
About this gene: This gene has 1 transcript (splice variant), 268 orthologues, 6 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp   Protein  Translation ID        Biotype         CCDS       UniProt        Flags
Reg3a-201  ENSMUST00000101272.2  825  175aa    ENSMUSP00000098829.1  Protein coding  CCDS20255  O09037 Q794C6  TSL:1 GENCODE basic APPRIS P1

[Figure: 23.12 kb region of chromosome 6 (78.375Mb to 78.390Mb). Forward strand: Reg3b-201, Reg3b-202 and Reg3b-204 (protein coding), Reg3b-203 (retained intron), Reg3a-201 (protein coding). Contig: AC113058.6. Reverse strand: Reg3d-201, Reg3d-202 and Reg3d-203 (protein coding). Gene legend: merged Ensembl/Havana and Ensembl protein coding; processed transcript (non-protein coding).]


Transcript: ENSMUST00000101272

[Figure: Reg3a-201 (protein coding), 3.12 kb, forward strand, with protein feature tracks for ENSMUSP00000098... on a scale of 0 to 175 aa.]

Protein features shown in the figure:
Cleavage site (Sign...
Superfamily: C-type fold
SMART: C-type lectin-like
Prints: PR01504
Pfam: C-type lectin-like
PROSITE profiles: C-type lectin-like
PROSITE patterns: C-type lectin, conserved site
PANTHER: PTHR22803:SF130; PTHR22803
Gene3D: C-type lectin-like/link domain superfamily
Sequence variants (dbSNP and all other sources): missense variant, synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
