https://www.alphaknockout.com

Mouse Nrl Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Nrl conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nrl gene (NCBI Reference Sequence: NM_001271916; Ensembl: ENSMUSG00000040632) is located on mouse chromosome 14. Four exons are identified, with the ATG start codon in exon 3 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000062232). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Nrl gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-158G18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a targeted null mutation exhibit a retinal defect causing loss of rod function, exaggerated cone function, short, sparse outer segments, and abnormal disks.

Exon 3 is not a frameshift exon, and it covers 53.59% of the coding region. The size of intron 2 (for 5'-loxP site insertion) is 1592 bp, and the size of intron 3 (for 3'-loxP site insertion) is 1200 bp. The size of the effective cKO region is ~881 bp. The cKO region does not contain any other known gene.
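The frameshift statement can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a coding length of 711 bp (237 codons for the 237 aa protein listed for ENSMUST00000062232, stop codon excluded) and infers the coding bases in exon 3 from the stated 53.59%. Neither number is given directly in this report, and the function names are hypothetical.

```python
def coding_bases_in_exon(cds_length_bp: int, percent_of_cds: float) -> int:
    """Estimate the coding bases carried by an exon from its share of the CDS."""
    return round(cds_length_bp * percent_of_cds / 100)


def is_frameshift_exon(coding_bases: int) -> bool:
    """An exon shifts the reading frame when its coding length is not a multiple of 3."""
    return coding_bases % 3 != 0


# Assumption: 237 aa protein -> 237 codons, stop codon not counted.
CDS_LENGTH_BP = 237 * 3

# Assumption: coding bases in exon 3 inferred from the reported 53.59%.
EXON3_CODING_BP = coding_bases_in_exon(CDS_LENGTH_BP, 53.59)

if __name__ == "__main__":
    print(EXON3_CODING_BP, is_frameshift_exon(EXON3_CODING_BP))
```

Under these assumptions the coding portion of exon 3 is a multiple of three, which agrees with the description of exon 3 as a non-frameshift exon. Loss of function is still expected after deletion because exon 3 carries the ATG start codon.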

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 to 4, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Nrl; homology arm; cKO region; loxP site.]
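The step from the targeted allele to the constitutive KO allele can be sketched as a toy model. It assumes the two loxP sites are in the same orientation, so that Cre recombination excises the flanked segment and leaves a single loxP site. The element labels and the function name are hypothetical and only illustrate the logic.

```python
def cre_recombine(allele: list[str]) -> list[str]:
    """Excise everything between the first and last loxP site, leaving one loxP.

    Assumes same-orientation loxP sites (excision, not inversion).
    """
    sites = [i for i, element in enumerate(allele) if element == "loxP"]
    if len(sites) < 2:
        # Nothing to recombine: return an unchanged copy.
        return list(allele)
    first, last = sites[0], sites[-1]
    return allele[:first] + ["loxP"] + allele[last + 1:]


# Hypothetical element labels for the targeted (floxed) allele.
TARGETED_ALLELE = ["exon 1", "exon 2", "loxP", "exon 3", "loxP", "exon 4"]
KO_ALLELE = cre_recombine(TARGETED_ALLELE)

if __name__ == "__main__":
    print(KO_ALLELE)
```

In this model the KO allele retains exons 1, 2 and 4 with one residual loxP site where exon 3 used to be.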

Overview of the Dot Plot

[Figure: dot plot of the sequence aligned against itself. Window size: 10 bp. Legend: forward; reverse complement. Both axes: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
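A self dot plot of this kind can be sketched as follows. This is a minimal illustration using exact word matches with a 10 bp window, not the tool used to produce the figure; the function names and the demo sequence are hypothetical. A tandem repeat shows up as a run of forward dots parallel to the main diagonal.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) dots for a sequence compared with itself.

    forward: pairs (i, j), i < j, where the windows at i and j are identical.
    reverse: pairs (i, j) where the window at i equals the reverse
             complement of the window at j.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        forward.extend((i, j) for i in starts for j in starts if i < j)
        rc_starts = positions.get(reverse_complement(word), ())
        reverse.extend((i, j) for i in starts for j in rc_starts)
    return sorted(forward), sorted(reverse)


# Hypothetical demo sequence: a 12 bp unit repeated twice in tandem.
DEMO_SEQ = "AAAAACCCCCGT" * 2
DEMO_FORWARD, DEMO_REVERSE = self_dot_plot(DEMO_SEQ, window=10)

if __name__ == "__main__":
    print(DEMO_FORWARD)
```

For the demo sequence the forward dots lie on the diagonal offset by 12, the length of the repeated unit. A sequence without repeats yields no forward dots away from the main diagonal.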

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12. Window size: 300 bp.]

Summary: Full Length(7381bp) | A(23.75% 1753) | C(23.02% 1699) | T(25.71% 1898) | G(27.52% 2031)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so constructing this targeting vector may be difficult.
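The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The sketch below is illustrative: the function names are hypothetical, the 300 bp default mirrors the window size quoted above, and the step size is an assumption.

```python
def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Overall GC percentage from base counts."""
    total = a + c + g + t
    return 100.0 * (g + c) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage for each window of the given size along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Base counts taken from the summary line of this report.
OVERALL_GC = gc_percent_from_counts(a=1753, c=1699, g=2031, t=1898)

if __name__ == "__main__":
    print(f"Overall GC content: {OVERALL_GC:.2f}%")
```

With the reported counts the overall GC content comes out close to 50.5%, so the difficulty noted above relates to local windows with high GC content rather than to the sequence as a whole.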

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr14  -       55522719   55525718   3000
YourSeq  155    1597   1858  3000   95.4%     chr6   -       146016516  146016883  368
YourSeq  150    1325   1874  3000   83.0%     chr1   +       15844811   15844982   172
YourSeq  148    1324   1851  3000   84.9%     chr6   -       4646726    4646887    162
YourSeq  146    1703   1878  3000   91.0%     chr7   -       43427890   43428060   171
YourSeq  144    1704   1858  3000   96.8%     chr4   -       149567239  149567477  239
YourSeq  144    1705   1859  3000   96.8%     chr2   +       167334930  167335091  162
YourSeq  144    1325   1859  3000   81.7%     chr1   +       139265581  139265751  171
YourSeq  143    1705   1859  3000   96.8%     chr9   -       103196850  103197026  177
YourSeq  143    1707   1858  3000   97.4%     chr9   -       56803956   56804109   154
YourSeq  143    1707   1854  3000   98.7%     chr9   -       14234411   14234560   150
YourSeq  143    1707   1860  3000   96.8%     chr14  -       50808916   50809071   156
YourSeq  143    1707   1859  3000   97.4%     chr11  +       114927782  114927936  155
YourSeq  142    1705   1859  3000   96.2%     chrX   -       83032980   83033139   160
YourSeq  142    1707   1859  3000   96.8%     chr5   -       135954720  135954874  155
YourSeq  142    1707   1859  3000   96.8%     chr5   -       130160906  130161061  156
YourSeq  142    1704   1858  3000   96.2%     chr15  -       46184986   46185148   163
YourSeq  142    1707   1859  3000   96.8%     chr14  -       11292775   11292929   155
YourSeq  142    1708   1860  3000   96.8%     chr11  -       33484860   33485014   155
YourSeq  142    1707   1859  3000   96.8%     chr1   -       33744802   33744956   155

Note: The 3000 bp section upstream of exon 3 is searched against the genome with BLAT. Apart from the expected self-match on chr14, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr14  -       55518838   55521837   3000
YourSeq  122    4      1656  3000   97.7%     chr1   +       37914872   38269301   354430
YourSeq  75     3      172   3000   96.3%     chr11  +       53388240   53388516   277
YourSeq  63     4      172   3000   91.0%     chr1   -       160722466  160737694  15229
YourSeq  63     3      539   3000   69.6%     chr1   -       33177062   33177131   70
YourSeq  58     1579   1648  3000   95.8%     chr10  -       94598029   94598122   94
YourSeq  56     3      60    3000   98.3%     chr1   -       177766241  177766298  58
YourSeq  56     2      64    3000   95.3%     chr11  +       72860095   72860158   64
YourSeq  55     3      59    3000   98.3%     chr5   -       122616631  122616687  57
YourSeq  55     2      58    3000   98.3%     chr15  -       80169241   80169297   57
YourSeq  55     2      62    3000   95.1%     chr1   -       89008841   89008901   61
YourSeq  55     3      59    3000   98.3%     chr11  +       115133220  115133276  57
YourSeq  54     4      59    3000   98.3%     chr7   +       122362107  122362162  56
YourSeq  54     1      59    3000   96.7%     chr11  +       97787326   98030841   243516
YourSeq  54     2      59    3000   96.6%     chr11  +       88046655   88046712   58
YourSeq  54     3      60    3000   96.6%     chr11  +       31829321   31829378   58
YourSeq  54     2      61    3000   95.0%     chr1   +       75212355   75212414   60
YourSeq  53     3      59    3000   96.5%     chr1   -       72647255   72647311   57
YourSeq  53     3      59    3000   96.5%     chr4   +       135475876  135475932  57
YourSeq  53     3      59    3000   96.5%     chr11  +       70449015   70449071   57

Note: The 3000 bp section downstream of exon 3 is searched against the genome with BLAT. Apart from the expected self-match on chr14, no significant similarity is found.
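The BLAT rows can be read programmatically to separate the expected self-match from secondary hits and to cross-check coordinates. The sketch below is an illustration only: the class, the function names and the self-match criterion (full query length at 100% identity) are assumptions, not part of the BLAT output or of this report's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one result row; any leading link text before the query name is ignored."""
    f = line.split()[-11:]  # QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_match(hit: BlatHit) -> bool:
    """Assumed criterion: the whole query aligns at 100% identity."""
    return hit.q_end - hit.q_start + 1 == hit.q_size and hit.identity == 100.0


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the alignment."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# Rows copied from the upstream and downstream result tables.
UP_SELF = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr14 - 55522719 55525718 3000")
DOWN_SELF = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr14 - 55518838 55521837 3000")
UP_SECOND = parse_hit("YourSeq 155 1597 1858 3000 95.4% chr6 - 146016516 146016883 368")

# Bases lying strictly between the downstream and upstream 3000 bp sections.
GAP_BP = UP_SELF.t_start - DOWN_SELF.t_end - 1

if __name__ == "__main__":
    print(GAP_BP, round(query_coverage(UP_SECOND), 3))
```

The interval between the two searched sections works out to 881 bp, which agrees with the stated size of the effective cKO region. The best secondary upstream hit covers under a tenth of the query, which is in line with the note that no significant similarity is found.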

Gene and information: Nrl, neural retina leucine zipper gene [Mus musculus (house mouse)]. Gene ID: 18185, updated on 1-Oct-2019

Gene summary

Official Symbol: Nrl (provided by MGI)
Official Full Name: neural retina leucine zipper gene (provided by MGI)
Primary source: MGI:MGI:102567
See related: Ensembl:ENSMUSG00000040632
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: D14H14S46E
Summary: This gene encodes a member of the basic leucine zipper domain family of transcription factors. The encoded protein is preferentially expressed in the retina and is necessary for rod photoreceptor development. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Dec 2012]
Expression: Low expression observed in reference dataset
Orthologs: human, all

Genomic context

Location: 14 C3; 14 28.19 cM

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (55518978..55524981, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (56137815..56143802, complement)

[Figure: genomic context of Nrl on Chromosome 14 - NC_000080.6.]

Transcript information: This gene has 7 transcripts

Gene: Nrl ENSMUSG00000040632

Description: neural retina leucine zipper gene [Source:MGI Symbol;Acc:MGI:102567]
Location: 55,518,978-55,524,981 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 7 transcripts (splice variants), 159 orthologues, 6 paralogues, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.

Transcripts

Name     Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Nrl-201  ENSMUST00000062232.14  2504  237aa       ENSMUSP00000054457.7  Protein coding   CCDS27113  P54846 Q543Y0  TSL:1 GENCODE basic APPRIS P1
Nrl-202  ENSMUST00000111404.7   2422  237aa       ENSMUSP00000107035.1  Protein coding   CCDS27113  P54846 Q543Y0  TSL:1 GENCODE basic APPRIS P1
Nrl-203  ENSMUST00000178694.2   1928  237aa       ENSMUSP00000136445.1  Protein coding   CCDS27113  P54846 Q543Y0  TSL:1 GENCODE basic APPRIS P1
Nrl-205  ENSMUST00000228287.1   742   226aa       ENSMUSP00000153933.1  Protein coding   -          A0A2I3BPV3     CDS 3' incomplete
Nrl-207  ENSMUST00000228902.1   733   62aa        ENSMUSP00000154322.1  Protein coding   -          A0A2I3BQU2     CDS 3' incomplete
Nrl-206  ENSMUST00000228351.1   4973  No protein  -                     Retained intron  -          -              -
Nrl-204  ENSMUST00000226858.1   731   No protein  -                     lncRNA           -          -              -

[Figure: Ensembl genome browser view of a 26.00 kb region, 55.51 Mb to 55.53 Mb. Forward strand: Cpne6-201, Cpne6-202, Cpne6-203 and Cpne6-207 (protein coding), Cpne6-204 (nonsense mediated decay), Cpne6-205 and Cpne6-206 (retained intron). Contig: AC159002.2. Reverse strand: Nrl-201, Nrl-202, Nrl-203, Nrl-205 and Nrl-207 (protein coding), Nrl-206 (retained intron), Nrl-204 (lncRNA) and Gm10876-201 (lncRNA). Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000062232

[Figure: protein domain and variant view for Nrl-201 (protein coding, reverse strand, 6.00 kb), protein ENSMUSP00000054... Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: Transcription factor, Skn-1-like, DNA-binding domain superfamily (SSF57959); SMART: Basic-leucine zipper domain; Pfam: Maf transcription factor, N-terminal, and Basic leucine zipper domain, Maf-type; PROSITE profiles: Basic-leucine zipper domain; PANTHER: Neural retina-specific leucine zipper protein, and Transcription factor Maf family; Gene3D: 1.20.5.170; CDD: cd14718; sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 237.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
