https://www.alphaknockout.com

Mouse Cryz Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cryz conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cryz gene (NCBI Reference Sequence: NM_009968; Ensembl: ENSMUSG00000028199) is located on mouse chromosome 3. Nine exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000029850). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cryz gene. To engineer the targeting vector, homology arms and the cKO region will be generated by PCR using BAC clone RP23-5D19 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 43.2% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4, for 5'-loxP site insertion, is 2248 bp, and the size of intron 5, for 3'-loxP site insertion, is 4586 bp. The size of the effective cKO region is ~552 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 to 9, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Cryz; homology arm; cKO region; loxP site.]
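The step from the targeted allele to the constitutive KO allele can be illustrated with a minimal sketch, assuming two loxP sites in the same orientation so that Cre excises the intervening segment and leaves a single loxP site behind. The function name `cre_excise` and the toy flanking sequences are hypothetical and are not the real Cryz locus; the 34 bp loxP sequence is the canonical wild-type site.

```python
# Canonical 34 bp loxP site: 13 bp repeat + 8 bp spacer + 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Delete the segment between the first two same-orientation loxP sites.

    One loxP site is kept, mimicking the product of Cre-mediated excision.
    Hypothetical helper for illustration only.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele  # no loxP site: nothing to excise
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele  # only one loxP site: nothing to excise
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Toy sequences (assumptions), standing in for homology arms and the cKO region.
upstream, floxed, downstream = "GGGGCCCC", "ACGTACGTAC", "TTTTAAAA"
targeted = upstream + LOXP + floxed + LOXP + downstream
ko = cre_excise(targeted)
print(len(targeted), len(ko))
```

In this toy model the KO allele is shorter than the targeted allele by the length of the floxed segment plus one loxP site, which is the kind of size difference a genotyping PCR across the region would detect.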


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
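The self-alignment described in this note can be approximated with a short sketch, assuming exact matches of fixed-length words (10 bp, as in the window size above). The real dot plot tool and its scoring are not stated in this report, so the function `self_dot_plot` and the toy repeat unit below are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact word matches.

    forward: seq[i:i+window] == seq[j:j+window] with j > i
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), ()))
    return forward, reverse


# Toy tandem repeat (assumption): a 10 bp unit repeated three times.
unit = "ACGGTCATGA"
forward, reverse = self_dot_plot(unit * 3, window=10)
print(forward[:3], len(reverse))
```

Forward matches that line up at a constant offset from the main diagonal, here at multiples of the 10 bp unit length, are the signature of a tandem repeat.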

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7052bp) | A(25.6% 1805) | C(22.79% 1607) | T(27.01% 1905) | G(24.6% 1735)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
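The base counts in the summary line give an overall GC content of (1607 + 1735) / 7052, about 47.4%. The sketch below reproduces that figure and shows one way a windowed GC profile could be computed. The 300 bp window comes from this report, while the step size and the function names are assumptions.

```python
# Base counts copied from the summary line of this report.
COUNTS = {"A": 1805, "C": 1607, "T": 1905, "G": 1735}


def gc_fraction_from_counts(counts: dict) -> float:
    """Overall GC fraction from per-base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window along seq (step size is an assumption)."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append((chunk.count("G") + chunk.count("C")) / window)
    return profile


print(f"overall GC: {gc_fraction_from_counts(COUNTS):.4f}")
# Toy sequence (assumption) with a small window, just to show the profile shape.
print(windowed_gc("GGCCAATT", window=4, step=2))
```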


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       154610573  154613572  3000
YourSeq  131    1298   1688  3000   90.7%     chr9   -       110516420  110517002  583
YourSeq  130    1303   2023  3000   80.8%     chr6   -       5426916    5427473    558
YourSeq  124    832    1004  3000   86.4%     chr14  -       118375810  118375986  177
YourSeq  120    1244   1594  3000   83.1%     chr9   -       120440605  120441033  429
YourSeq  117    1217   1756  3000   76.0%     chr3   -       55619234   55619582   349
YourSeq  117    1240   1758  3000   79.9%     chr9   +       21861267   21861643   377
YourSeq  115    1219   1547  3000   85.5%     chrX   +       103660728  103661049  322
YourSeq  112    1225   1530  3000   86.5%     chr2   +       91858579   91858883   305
YourSeq  110    1329   1757  3000   78.5%     chr8   -       83648160   83648428   269
YourSeq  110    1241   1575  3000   85.6%     chr16  -       22360603   22360957   355
YourSeq  108    1226   1506  3000   91.0%     chr14  +       32159046   32159329   284
YourSeq  106    1240   1760  3000   77.7%     chr14  +       19863417   19863793   377
YourSeq  101    1330   1685  3000   87.3%     chr14  -       62735543   62735962   420
YourSeq  100    1233   1763  3000   76.8%     chr12  -       109392181  109392531  351
YourSeq  96     1295   1758  3000   75.6%     chr18  +       60499293   60499614   322
YourSeq  95     1242   1553  3000   85.0%     chr15  -       39960966   39961297   332
YourSeq  94     1240   1758  3000   75.5%     chr11  +       104285630  104285955  326
YourSeq  91     1227   1758  3000   77.8%     chr5   -       91985915   91986281   367
YourSeq  90     1295   1758  3000   79.1%     chr17  -       13127706   13128047   342

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   +       154614125  154617124  3000
YourSeq  73     2804   2937  3000   82.2%     chr2   +       32903063   32903173   111
YourSeq  60     2849   2950  3000   93.0%     chr1   +       181464501  181812351  347851
YourSeq  52     2800   2930  3000   73.4%     chr7   +       144724644  144724721  78
YourSeq  50     2859   2935  3000   94.8%     chr7   +       130888656  130888921  266
YourSeq  46     2801   2875  3000   96.1%     chr19  -       6431673    6431938    266
YourSeq  46     2772   2878  3000   92.8%     chr15  -       99981012   99981265   254
YourSeq  46     2901   2952  3000   96.2%     chr14  +       49715896   49715981   86
YourSeq  42     2801   2879  3000   93.9%     chr2   -       155305467  155305887  421
YourSeq  40     2807   2928  3000   93.5%     chr13  +       44610598   44610720   123
YourSeq  40     2900   2952  3000   95.6%     chr11  +       98555668   98555723   56
YourSeq  36     2196   2515  3000   56.5%     chr6   +       138461255  138461335  81
YourSeq  32     2850   2884  3000   97.2%     chr10  +       59790131   59790166   36
YourSeq  32     2903   2937  3000   97.2%     chr10  +       59790131   59790166   36
YourSeq  31     2794   2831  3000   94.5%     chr18  -       10735166   10735203   38
YourSeq  31     2902   2937  3000   94.3%     chr1   +       181464501  181464536  36
YourSeq  30     2849   2879  3000   100.0%    chr3   -       156757540  156757571  32
YourSeq  30     2772   2826  3000   96.9%     chr11  -       107713428  107713482  55
YourSeq  30     2853   2884  3000   100.0%    chr17  +       36758317   36758377   61
YourSeq  30     2860   2930  3000   64.8%     chr13  +       44610653   44610696   44

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
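The top-scoring hit in each BLAT table is the query's own location on chr3, and the gap between the two hits can be compared with the stated cKO region size. The sketch below is an illustration only: it assumes the reported coordinates are 1-based and inclusive, which is consistent with each 3000 bp query spanning exactly 3000 positions. The `Hit` type and helper names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


def span(hit: Hit) -> int:
    """Number of positions covered by a hit."""
    return hit.end - hit.start + 1


def gap_between(upstream: Hit, downstream: Hit) -> int:
    """Number of positions strictly between two hits on the same chromosome."""
    if upstream.chrom != downstream.chrom:
        raise ValueError("hits are on different chromosomes")
    return downstream.start - upstream.end - 1


# Self-hits copied from the first row of each BLAT table above.
up = Hit("chr3", 154610573, 154613572)
down = Hit("chr3", 154614125, 154617124)

print(span(up), span(down), gap_between(up, down))
```

Under these assumptions the gap is 552 bp, matching the ~552 bp effective cKO region given in the strategy note.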


Gene and protein information:

Cryz crystallin, zeta [Mus musculus (house mouse)]

Gene ID: 12972, updated on 24-Oct-2019

Gene summary

Official Symbol: Cryz (provided by MGI)
Official Full Name: crystallin, zeta (provided by MGI)
Primary source: MGI:MGI:88527
See related: Ensembl:ENSMUSG00000028199
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Sez9
Expression: Biased expression in kidney adult (RPKM 72.1), placenta adult (RPKM 16.5) and 8 other tissues
Orthologs: human; all

Genomic context

Location: 3; 3 H4

Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (154596711..154623182)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (154259976..154286146)

[Figure: genomic context of Cryz on chromosome 3 (NC_000069.6).]


Transcript information: This gene has 11 transcripts

Gene: Cryz ENSMUSG00000028199

Description: crystallin, zeta [Source:MGI Symbol;Acc:MGI:88527]
Gene Synonyms: Sez9, quinone reductase
Location: Chromosome 3: 154,596,711-154,623,182 forward strand. GRCm38:CM000996.2
About this gene: This gene has 11 transcripts (splice variants), 230 orthologues, 16 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Cryz-201  ENSMUST00000029850.14  2507  331aa       ENSMUSP00000029850.8  Protein coding           CCDS17927  P47199      TSL:1, GENCODE basic, APPRIS P1
Cryz-208  ENSMUST00000192462.5   2505  331aa       ENSMUSP00000142105.1  Protein coding           CCDS17927  P47199      TSL:1, GENCODE basic, APPRIS P1
Cryz-209  ENSMUST00000194876.5   1059  301aa       ENSMUSP00000142101.1  Protein coding           -          A0A0A6YXR4  CDS 3' incomplete, TSL:1
Cryz-206  ENSMUST00000155385.7   755   137aa       ENSMUSP00000122619.1  Protein coding           -          D3Z4Q4      CDS 3' incomplete, TSL:5
Cryz-205  ENSMUST00000155232.1   641   173aa       ENSMUSP00000118449.1  Protein coding           -          D3YUG9      CDS 3' incomplete, TSL:3
Cryz-204  ENSMUST00000144764.7   581   152aa       ENSMUSP00000121269.1  Protein coding           -          D3Z2X0      CDS 3' incomplete, TSL:3
Cryz-210  ENSMUST00000195103.1   513   171aa       ENSMUSP00000141246.1  Protein coding           -          A0A0A6YVS7  CDS 5' and 3' incomplete, TSL:5
Cryz-203  ENSMUST00000140644.7   473   97aa        ENSMUSP00000115146.1  Protein coding           -          D3YWU6      CDS 3' incomplete, TSL:3
Cryz-207  ENSMUST00000184537.7   2606  223aa       ENSMUSP00000139387.1  Nonsense mediated decay  -          V9GXY8      TSL:1
Cryz-202  ENSMUST00000135723.1   795   56aa        ENSMUSP00000143311.1  Nonsense mediated decay  -          A0A0G2JFU5  TSL:5
Cryz-211  ENSMUST00000195292.1   3476  No protein  -                     Retained intron          -          -           TSL:NA


[Figure: genome browser view of the Cryz locus (46.47 kb; chromosome 3, 154.59 Mb to 154.63 Mb). Forward strand: transcripts Cryz-201 to Cryz-211, of which Cryz-201, -203, -204, -205, -206, -208, -209 and -210 are protein coding, Cryz-202 and -207 are nonsense mediated decay, and Cryz-211 is a retained intron. Contigs: AC164292.4, AC107667.12. Reverse strand: Tyw3-201, Tyw3-202 and Tyw3-204 (protein coding) and Tyw3-203 (lncRNA). Regulatory Build legend: CTCF; Open Chromatin; Promoter; Promoter Flank. Gene legend: Ensembl protein coding; merged Ensembl/Havana; processed transcript; RNA gene.]


Transcript: ENSMUST00000029850

[Figure: protein domain and feature view of Cryz-201 (protein coding; 26.47 kb, forward strand; protein ENSMUSP00000029..., scale 0 to 331 aa). Annotation tracks: Low complexity (Seg); Superfamily: NAD(P)-binding domain superfamily, GroES-like superfamily; SMART: Polyketide synthase, enoylreductase domain; Pfam: Alcohol dehydrogenase, N-terminal; Alcohol dehydrogenase, C-terminal; PROSITE patterns: Quinone oxidoreductase/zeta-crystallin, conserved site; PANTHER: PTHR44154; Gene3D: 3.40.50.720, 3.90.180.10; CDD: cd08253. Sequence variants (dbSNP and all other sources): missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
