https://www.alphaknockout.com

Mouse Crb1 Knockout Project (CRISPR/Cas9)

Objective: To create a Crb1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Crb1 gene (NCBI Reference Sequence: NM_133239; Ensembl: ENSMUSG00000063681) is located on mouse chromosome 1. Twelve exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 12 (Transcript: ENSMUST00000059825). Exon 3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a null allele show focal retinal lesions, loss of adherens junctions between photoreceptors and Müller glia cells, and light-accelerated retinal degeneration. Homozygotes for a spontaneous allele show background-sensitive retinal spotting, photoreceptor dysplasia and degeneration.

Exon 3 starts from about 15.42% of the coding region. Exon 3 covers 4.65% of the coding region. The size of effective KO region: ~196 bp. The KO region does not have any other known gene.
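The percentages above can be cross-checked with a short calculation. This is only a sketch: the coding-sequence length is not stated in this report, so it is assumed here to be (1405 + 1) x 3 bp, derived from the 1405 aa protein listed for ENSMUST00000059825 plus one stop codon. It also assumes that the whole ~196 bp KO region is coding sequence.

```python
# Figures taken from the report. The CDS length is an ASSUMPTION derived from
# the 1405 aa protein of ENSMUST00000059825 plus one stop codon.
PROTEIN_LENGTH_AA = 1405
KO_REGION_BP = 196
EXON3_START_FRACTION = 0.1542  # "about 15.42% of the coding region"

cds_length_bp = (PROTEIN_LENGTH_AA + 1) * 3

# Share of the coding region covered by the KO region, in percent.
ko_fraction_percent = round(KO_REGION_BP / cds_length_bp * 100, 2)

# Approximate position in the CDS where exon 3 begins.
exon3_start_bp = round(EXON3_START_FRACTION * cds_length_bp)

# A deletion whose length is not a multiple of 3 shifts the reading frame.
causes_frameshift = KO_REGION_BP % 3 != 0

print(f"Assumed CDS length: {cds_length_bp} bp")
print(f"KO region covers {ko_fraction_percent}% of the CDS")
print(f"Exon 3 starts near CDS position {exon3_start_bp}")
print(f"Frameshift expected: {causes_frameshift}")
```

Under these assumptions the computed coverage agrees with the 4.65% quoted above, and because 196 is not divisible by 3, loss of the region would be expected to shift the downstream reading frame.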


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked; exons 1, 3 and 12 are labeled. Legend: exon of mouse Crb1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches; axes labeled "Sequence 12".]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
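The self-alignment described in this note can be approximated in a few lines. The sketch below is not the tool used for the report; it is a minimal illustration that finds exact forward matches only (the report's plot also shows reverse-complement matches), using the 15 bp window size stated above. The function names and toy sequences are hypothetical.

```python
from collections import defaultdict


def self_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose windows of length `window` are identical.

    Each pair is one off-diagonal dot in a self dot plot (exact matches only).
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    pairs = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                pairs.append((starts[a], starts[b]))
    return sorted(pairs)


def repeat_offsets(pairs):
    """Count dots per offset (j - i); many dots at one small offset suggest a tandem repeat."""
    counts = defaultdict(int)
    for i, j in pairs:
        counts[j - i] += 1
    return dict(counts)


# Toy inputs (hypothetical, not taken from the Crb1 locus).
unique_seq = "ACGTTGCAAGGCTTAACCGGATATCGCGTA"
tandem_seq = "GATTACA" * 5  # a 7 bp unit repeated in tandem

print(repeat_offsets(self_matches(unique_seq)))  # no off-diagonal dots
print(repeat_offsets(self_matches(tandem_seq)))  # dots at multiples of 7
```

In a real dot plot the same signal appears as lines running parallel to the main diagonal at a distance equal to the repeat unit length.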

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches; axes labeled "Sequence 12".]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence; axis labeled "Sequence 12".]

Summary: Full length (2000 bp) | A (32.15%, 643) | C (17.75%, 355) | T (29.45%, 589) | G (20.65%, 413)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
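A windowed GC profile of the kind described here can be sketched as follows. The 300 bp window comes from the report; the one-base step, the function name and the toy sequence are assumptions made for illustration, and the report does not say how its own profile was computed.

```python
from itertools import accumulate


def gc_windows(seq, window=300):
    """Return the GC fraction (0.0 to 1.0) of every window of length `window`, step 1."""
    flags = [1 if base in "GCgc" else 0 for base in seq]
    prefix = [0] + list(accumulate(flags))  # prefix[k] = number of G/C in seq[:k]
    return [
        (prefix[i + window] - prefix[i]) / window
        for i in range(len(seq) - window + 1)
    ]


# Toy sequence (hypothetical): 300 bp of AT followed by 300 bp of GC.
toy = "AT" * 150 + "GC" * 150
profile = gc_windows(toy)

print(len(profile))                 # number of windows
print(profile[0], profile[150], profile[-1])
```

A region would be flagged as high in GC content when the profile stays above some chosen cutoff; the report does not state which cutoff was applied.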

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence; axis labeled "Sequence 12".]

Summary: Full length (2000 bp) | A (29.45%, 589) | C (19.75%, 395) | T (34.95%, 699) | G (15.85%, 317)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
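The overall GC content of each flank follows directly from the base counts in the two summary lines. The sketch below only re-derives it from those published counts; the variable and function names are illustrative.

```python
# Base counts copied from the two "Summary" lines of this report.
upstream_counts = {"A": 643, "C": 355, "T": 589, "G": 413}
downstream_counts = {"A": 589, "C": 395, "T": 699, "G": 317}


def gc_percent(counts):
    """Overall GC content in percent, rounded to two decimals."""
    total = sum(counts.values())
    return round((counts["G"] + counts["C"]) / total * 100, 2)


upstream_gc = gc_percent(upstream_counts)
downstream_gc = gc_percent(downstream_counts)

print(f"Upstream flank:   {sum(upstream_counts.values())} bp, GC {upstream_gc}%")
print(f"Downstream flank: {sum(downstream_counts.values())} bp, GC {downstream_gc}%")
```

Both flanks total 2000 bp and come out below 40% GC overall, which is consistent with the notes reporting no high GC-content region.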


BLAT Search Results (up)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1        2000   2000   100.0%    chr1   -       139328347  139330346  2000
YourSeq  162    1        179    2000   95.5%     chr2   -       87336701   87337246   546
YourSeq  150    1        154    2000   98.8%     chrX   -       122439700  122439853  154
YourSeq  150    1        154    2000   98.8%     chr9   -       87110823   87110976   154
YourSeq  150    1        154    2000   98.8%     chr7   -       69875508   69875661   154
YourSeq  150    1        154    2000   98.8%     chr4   -       15866957   15867110   154
YourSeq  150    1        154    2000   98.8%     chr3   -       16093268   16093421   154
YourSeq  150    1        154    2000   98.8%     chr2   -       95707315   95707468   154
YourSeq  150    1        154    2000   98.8%     chr2   -       20151477   20151630   154
YourSeq  150    1        154    2000   98.8%     chr16  -       40818943   40819096   154
YourSeq  150    1        154    2000   98.8%     chr14  -       93488400   93488553   154
YourSeq  150    1        154    2000   98.8%     chr12  -       28196183   28196336   154
YourSeq  150    1        154    2000   98.8%     chr12  -       28492552   28492705   154
YourSeq  150    1        154    2000   98.8%     chr1   -       82178538   82178691   154
YourSeq  150    1        154    2000   98.8%     chrX   +       150939984  150940137  154
YourSeq  150    1        154    2000   98.8%     chr9   +       17403468   17403621   154
YourSeq  150    1        154    2000   98.8%     chr9   +       11135713   11135866   154
YourSeq  150    1        154    2000   98.8%     chr7   +       22547712   22547865   154
YourSeq  150    1        154    2000   98.8%     chr7   +       20558583   20558736   154
YourSeq  150    1        154    2000   98.8%     chr6   +       123073111  123073264  154

Note: The 2000 bp section upstream of Exon 3 is searched against the genome with BLAT. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  Q_START  Q_END  QSIZE  IDENTITY  CHROM  STRAND  T_START    T_END      SPAN
------------------------------------------------------------------------------------------
YourSeq  2000   1        2000   2000   100.0%    chr1   -       139326151  139328150  2000
YourSeq  173    1448     1691   2000   94.5%     chr9   -       13079356   13079634   279
YourSeq  153    1507     1720   2000   95.3%     chr1   -       125731376  125731593  218
YourSeq  128    1533     1718   2000   94.2%     chr13  +       29875092   29875280   189
YourSeq  113    1461     1720   2000   83.4%     chr17  +       9201004    9201224    221
YourSeq  107    1541     1691   2000   91.4%     chr9   +       11868309   11868471   163
YourSeq  94     1599     1720   2000   94.3%     chrX   +       164628038  164628160  123
YourSeq  90     1595     1720   2000   91.0%     chr14  +       99751767   99751901   135
YourSeq  85     1496     1684   2000   86.5%     chr12  -       27317983   27318165   183
YourSeq  84     1606     1720   2000   92.0%     chr18  +       72500121   72500240   120
YourSeq  83     1604     1720   2000   92.9%     chr10  +       73971264   73971393   130
YourSeq  80     1590     1717   2000   90.8%     chr4   -       68175541   68175680   140
YourSeq  80     1616     1720   2000   88.5%     chr3   -       76751562   76751664   103
YourSeq  80     1599     1718   2000   90.9%     chr15  +       25336829   25336956   128
YourSeq  80     1607     1720   2000   90.7%     chr1   +       169845943  169846056  114
YourSeq  78     1170     1720   2000   73.5%     chr4   -       111698400  111698508  109
YourSeq  78     1618     1720   2000   84.7%     chr10  -       50883874   50883968   95
YourSeq  78     1606     1720   2000   89.2%     chr1   -       146936354  146936485  132
YourSeq  77     1607     1720   2000   90.7%     chr4   -       71948098   71948222   125
YourSeq  77     1615     1720   2000   85.4%     chrX   +       65145106   65145202   97

Note: The 2000 bp section downstream of Exon 3 is searched against the genome with BLAT. No significant similarity is found.
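The top (100% identity) hits of the two BLAT searches give the genomic positions of the two 2000 bp flanks on chr1. As a rough consistency check, the interval between them can be computed. This sketch assumes that both flanks directly abut the targeted region; the report does not state this explicitly, and the names used are illustrative.

```python
# Top BLAT hits (100% identity, chr1, minus strand) copied from the two tables.
UPSTREAM_FLANK = (139328347, 139330346)    # flank upstream of Exon 3
DOWNSTREAM_FLANK = (139326151, 139328150)  # flank downstream of Exon 3


def interval_length(interval):
    """Length in bp of an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, higher):
    """Return the inclusive interval lying strictly between two non-overlapping intervals."""
    return (lower[1] + 1, higher[0] - 1)


# The gene lies on the reverse strand, so the upstream flank has the higher coordinates.
target_region = gap_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)

print(f"Flank lengths: {interval_length(UPSTREAM_FLANK)} bp, {interval_length(DOWNSTREAM_FLANK)} bp")
print(f"Region between flanks: chr1:{target_region[0]}-{target_region[1]}")
print(f"Length: {interval_length(target_region)} bp")
```

The interval between the flanks is 196 bp long, matching the ~196 bp effective KO region quoted in the strategy summary.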


Gene and information: Crb1 crumbs family member 1, photoreceptor morphogenesis associated [ Mus musculus (house mouse) ] Gene ID: 170788, updated on 22-Oct-2019

Gene summary

Official Symbol: Crb1 (provided by MGI)
Official Full Name: crumbs family member 1, photoreceptor morphogenesis associated (provided by MGI)
Primary source: MGI:MGI:2136343
See related: Ensembl:ENSMUSG00000063681
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 7530426H14Rik; A930008G09Rik
Expression: Low expression observed in reference dataset
Orthologs: human

Genomic context

Location: 1 E4; 1 60.87 cM
Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  1    NC_000067.6 (139176649..139379316, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    1    NC_000067.5 (141094831..141273653, complement)

Chromosome 1 - NC_000067.6


Transcript information: This gene has 7 transcripts

Gene: Crb1 ENSMUSG00000063681

Description: crumbs family member 1, photoreceptor morphogenesis associated [Source:MGI Symbol;Acc:MGI:2136343]
Gene Synonyms: 7530426H14Rik, A930008G09Rik
Location: 139,197,056-139,377,100 reverse strand. GRCm38:CM000994.2
About this gene: This gene has 7 transcripts (splice variants), 258 orthologues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Crb1-201  ENSMUST00000059825.11  5901  1405aa      ENSMUSP00000060769.5  Protein coding           CCDS15336  Q8VHS2      TSL:1, GENCODE basic, APPRIS P2
Crb1-204  ENSMUST00000198445.4   6135  1314aa      ENSMUSP00000142552.1  Protein coding           -          A0A0G2JDY0  TSL:1, GENCODE basic, APPRIS ALT2
Crb1-202  ENSMUST00000196402.4   2711  761aa       ENSMUSP00000142702.1  Protein coding           -          Q8VHS2      TSL:1, GENCODE basic
Crb1-207  ENSMUST00000200340.1   757   100aa       ENSMUSP00000142909.1  Nonsense mediated decay  -          A0A0G2JEU4  CDS 5' incomplete, TSL:3
Crb1-203  ENSMUST00000197035.1   636   46aa        ENSMUSP00000143386.1  Nonsense mediated decay  -          A0A0G2JG15  CDS 5' incomplete, TSL:3
Crb1-206  ENSMUST00000199479.1   4033  No protein  -                     Retained intron          -          -           TSL:NA
Crb1-205  ENSMUST00000199291.1   3671  No protein  -                     Retained intron          -          -           TSL:1


[Figure: Ensembl genome browser view of a 200.04 kb region (scale marks at 139.20 Mb, 139.25 Mb, 139.30 Mb and 139.35 Mb), with forward and reverse strands.
Forward strand gene: 4933436E23Rik-201 (lncRNA).
Contigs: AC138741.7 >, AC116810.14 >, < AL606536.13.
Reverse strand genes (Comprehensive set): Crb1-201, Crb1-204 and Crb1-202 (protein coding); Crb1-206 and Crb1-205 (retained intron); Crb1-207 and Crb1-203 (nonsense mediated decay).
Additional track: Regulatory Build.
Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site.
Gene legend: Protein Coding (merged Ensembl/Havana, Ensembl protein coding); Non-Protein Coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000059825

[Figure: Crb1-201 (protein coding), reverse strand, 180.02 kb, with the protein feature map of ENSMUSP00000060... on a scale of 0 to 1405 amino acids.
Feature tracks: Transmembrane heli..., Low complexity (Seg), Cleavage site (Sign...
Superfamily: SSF57196; Growth factor receptor cysteine-rich domain superfamily; Concanavalin A-like lectin/glucanase domain superfamily
SMART: EGF-like domain; EGF-like calcium-binding domain; Laminin G domain
Prints: PR01983; PR00010
Pfam: G domain; EGF-like domain
PROSITE profiles: Laminin G domain; EGF-like domain
PROSITE patterns: EGF-like calcium-binding, conserved site; EGF-like, conserved site (two entries); EGF-type aspartate/asparagine hydroxylation site
PANTHER: PTHR24049; PTHR24049:SF1
Gene3D: 2.10.25.10; 2.60.120.200
CDD: cd00054; cd00110
Variant tracks: All sequence SNPs/i...; Sequence variants (dbSNP and all other sources)
Variant legend: frameshift variant, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
