http://beta.alphaknockout.cyagen.net

Mouse Nprl2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Nprl2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Nprl2 gene (NCBI Reference Sequence: NM_018879; Ensembl: ENSMUSG00000010057) is located on mouse chromosome 9. Eleven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 11 (Transcript: ENSMUST00000010201). Exons 5~11 will be selected as the conditional knockout region (cKO region). The second loxP site will be inserted downstream of the TGA stop codon. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-95A11 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit reduced embryo size, microphthalmia, occasional anophthalmia, pale liver, reduced fetal liver hematopoiesis, impaired erythropoiesis and reduced methionine synthesis.

Exons 5~11 are not frameshift exons and cover 60.7% of the coding region. The size of intron 4 for 5'-loxP site insertion: 535 bp.

The size of the effective cKO region: ~1690 bp. The function of mouse Cyb561d2 and Zmynd10 may be affected by deleting this cKO region.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Exons shown: mouse Cyb561d2 exon 1, mouse Nprl2 exons 1 to 11 (TGA stop codon marked), and mouse Zmynd10 exons 1 and 2. Legend: exon of mouse Cyb561d2; exon of mouse Nprl2; exon of mouse Zmynd10; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
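The self-alignment behind the dot plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool that produced the plot above: the function names are hypothetical, and exact matching of 10 bp words is an assumed reading of "Window size: 10 bp". Each returned pair of positions corresponds to one off-diagonal dot; long diagonal runs of such pairs would indicate a tandem or inverted repeat.

```python
from collections import defaultdict

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_matches(seq, window=10):
    """Find positions where a word of length `window` recurs in `seq`.

    Returns two sorted lists of (i, j) start positions (0-based):
    forward matches (same word at i < j) and reverse-complement
    matches (word at i, its reverse complement at j, i <= j).
    The main diagonal (i == j, forward) is omitted because every
    sequence trivially matches itself there.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in positions.items():
        # Same word seen at more than one position.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                forward.append((starts[a], starts[b]))
        # Reverse complement of this word seen somewhere in the sequence.
        for j in positions.get(reverse_complement(word), []):
            for i in starts:
                if i <= j:
                    reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequences (made up for illustration, not taken from the report).
fwd_toy, rev_toy = self_matches("GATTACAGATTACA", window=7)
fwd_inv, rev_inv = self_matches("AAAACCCGGGTTTT", window=7)
```

In the first toy sequence the word GATTACA occurs twice, so one forward match is reported; the second toy sequence is an inverted repeat, so only reverse-complement matches are reported.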

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(8056bp) | A(24.57% 1979) | C(25.26% 2035) | G(28.45% 2292) | T(21.72% 1750)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
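The base counts in the summary can be checked, and the windowed analysis sketched, as follows. The counts are taken from the summary line; the function names and the 0.70 cutoff for a "high GC" window are assumptions made for illustration, since the report does not state its criterion.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, moving by `step` bases."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, step=1, cutoff=0.70):
    """Start positions of windows whose GC fraction exceeds `cutoff`.

    The cutoff value is an assumption, not a threshold from the report.
    """
    return [
        i * step
        for i, gc in enumerate(sliding_gc(seq, window, step))
        if gc > cutoff
    ]


# Base counts as given in the report's summary line.
counts = {"A": 1979, "C": 2035, "G": 2292, "T": 1750}
total = sum(counts.values())                        # full length in bp
overall_gc = (counts["C"] + counts["G"]) / total    # overall GC fraction

# Toy sequence (made up) to show the windowed calculation.
toy_profile = sliding_gc("GGCCAATT", window=4, step=2)
```

The counts sum to the stated full length of 8056 bp, and the overall GC content works out to about 53.7%.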


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr9   +       107540902  107543901  3000
YourSeq  35     2957   3000  3000   80.0%     chr15  +       78862736   78862775   40
YourSeq  32     2968   3000  3000   100.0%    chr7   -       79670356   79670389   34
YourSeq  32     76     305   3000   97.1%     chr1   -       21241044   21241476   433
YourSeq  30     2969   3000  3000   96.9%     chr17  -       6891256    6891287    32
YourSeq  29     2972   3000  3000   100.0%    chr7   +       44388945   44388973   29
YourSeq  28     2973   3000  3000   100.0%    chr8   +       122563433  122563460  28
YourSeq  28     2947   3000  3000   96.7%     chr2   +       112452186  112452240  55
YourSeq  27     2583   2622  3000   88.6%     chr5   -       53364061   53364104   44
YourSeq  27     2974   3000  3000   100.0%    chr10  +       60248056   60248082   27
YourSeq  26     2216   2246  3000   82.8%     chr4   -       22479363   22479391   29
YourSeq  23     957    979   3000   100.0%    chr2   -       145935031  145935053  23
YourSeq  23     2978   3000  3000   100.0%    chr1   -       16154214   16154236   23
YourSeq  23     2168   2190  3000   100.0%    chr3   +       110311385  110311407  23
YourSeq  22     2058   2079  3000   100.0%    chr7   +       82585146   82585167   22
YourSeq  21     2216   2236  3000   100.0%    chr11  -       87490071   87490091   21
YourSeq  21     2216   2236  3000   100.0%    chr11  -       77755891   77755911   21

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr9   +       107545708  107548707  3000
YourSeq  95     116    290   3000   82.3%     chr10  +       40996463   40996634   172
YourSeq  88     171    318   3000   85.6%     chr7   -       101803778  101803926  149
YourSeq  88     115    298   3000   81.3%     chr11  -       119389308  119389468  161
YourSeq  85     114    291   3000   86.0%     chr1   -       10747475   10747651   177
YourSeq  83     115    291   3000   81.2%     chr3   -       138888516  138888662  147
YourSeq  80     195    365   3000   88.6%     chr11  +       100299757  100300106  350
YourSeq  79     171    291   3000   88.5%     chrX   +       12841761   12841881   121
YourSeq  78     129    293   3000   81.1%     chr13  -       55731164   55731301   138
YourSeq  78     171    299   3000   81.5%     chr12  -       71103258   71103386   129
YourSeq  76     171    291   3000   86.8%     chr14  -       39793487   39793607   121
YourSeq  74     115    290   3000   81.0%     chr12  -       83669922   83670053   132
YourSeq  74     130    291   3000   90.6%     chr10  -       63242386   63242553   168
YourSeq  71     186    363   3000   77.3%     chr10  +       80885356   80885473   118
YourSeq  70     177    315   3000   81.8%     chr12  +       4657690    4657826    137
YourSeq  69     195    293   3000   93.9%     chr13  -       98281750   98281848   99
YourSeq  68     195    293   3000   91.6%     chr12  -       43474998   43475096   99
YourSeq  68     197    291   3000   93.7%     chr8   +       85103786   85103880   95
YourSeq  67     195    299   3000   89.5%     chr12  +       89409575   89409679   105
YourSeq  67     186    292   3000   88.3%     chr10  +       59538000   59538105   106

Note: The 3000 bp section downstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.
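A screen of the two BLAT tables for off-target similarity can be sketched as below. This is a hedged illustration only: the `BlatHit` type, the function names, and the `min_score` cutoff of 100 are assumptions, as the report does not say what score it treats as significant. The parser accepts rows with or without the leading "browser details" link labels.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    """One row of BLAT output (hypothetical container for this sketch)."""
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_rows(text: str) -> List[BlatHit]:
    """Parse whitespace-separated BLAT rows, one hit per line."""
    hits = []
    for line in text.splitlines():
        # Drop the "browser" / "details" link labels if they are present.
        fields = [f for f in line.split() if f not in ("browser", "details")]
        # Skip blank lines, the header row, and anything malformed.
        if len(fields) != 11 or fields[0] != "YourSeq":
            continue
        hits.append(BlatHit(
            int(fields[1]), int(fields[2]), int(fields[3]), int(fields[4]),
            float(fields[5].rstrip("%")), fields[6], fields[7],
            int(fields[8]), int(fields[9]), int(fields[10]),
        ))
    return hits


def off_target_hits(hits: List[BlatHit], min_score: int = 100) -> List[BlatHit]:
    """Hits other than the full-length self match that reach `min_score`.

    `min_score` is an assumed cutoff, not a value stated in the report.
    """
    return [
        h for h in hits
        if not (h.score == h.q_size and h.identity == 100.0)
        and h.score >= min_score
    ]


# First three rows of the upstream table, copied from the report.
example = """\
browser details YourSeq 3000 1 3000 3000 100.0% chr9 + 107540902 107543901 3000
browser details YourSeq 35 2957 3000 3000 80.0% chr15 + 78862736 78862775 40
browser details YourSeq 32 2968 3000 3000 100.0% chr7 - 79670356 79670389 34
"""
hits = parse_blat_rows(example)
flagged = off_target_hits(hits)
```

In both tables the highest score apart from the chr9 self match is below 100 (35 upstream, 95 downstream), so with this assumed cutoff no hit would be flagged.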

Gene and information:

Nprl2 NPR2 like, GATOR1 complex subunit [ Mus musculus (house mouse) ]

Gene ID: 56032, updated on 12-Aug-2019

Gene summary

Official Symbol: Nprl2 (provided by MGI)
Official Full Name: NPR2 like, GATOR1 complex subunit (provided by MGI)
Primary source: MGI:MGI:1914482
See related: Ensembl:ENSMUSG00000010057
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: G21; NPR2L; Tusc4; 2810446G01Rik
Expression: Ubiquitous expression in CNS E18 (RPKM 22.0), whole brain E14.5 (RPKM 21.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 9; 9 F1

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (107542177..107545706)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (107444540..107448037)

Chromosome 9 - NC_000075.6


Transcript information: This gene has 5 transcripts

Gene: Nprl2 ENSMUSG00000010057

Description: NPR2 like, GATOR1 complex subunit [Source:MGI Symbol;Acc:MGI:1914482]
Gene Synonyms: 2810446G01Rik, G21, NPR2L, NPRL2, Tusc4
Location: Chromosome 9: 107,542,226-107,545,706 forward strand. GRCm38:CM001002.2
About this gene: This gene has 5 transcripts (splice variants), 203 orthologues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Nprl2-201  ENSMUST00000010201.8  1441  380aa       ENSMUSP00000010201.3  Protein coding           CCDS23492  Q9WUE4      TSL:1, GENCODE basic, APPRIS P1
Nprl2-205  ENSMUST00000195370.5  829   163aa       ENSMUSP00000141746.1  Nonsense mediated decay  -          A0A0A6YWX8  TSL:3
Nprl2-203  ENSMUST00000193628.5  1736  No protein  -                     Retained intron          -          -           TSL:2
Nprl2-204  ENSMUST00000194848.1  637   No protein  -                     Retained intron          -          -           TSL:2
Nprl2-202  ENSMUST00000192951.1  391   No protein  -                     Retained intron          -          -           TSL:3


[Figure: Ensembl genome browser view of a 23.48 kb region (107.535Mb to 107.555Mb), comprehensive gene set, on contig AL672219.7.
Forward strand: Tmem115-201 (protein coding); Nprl2-201 (protein coding), Nprl2-205 (nonsense mediated decay), Nprl2-203, Nprl2-204 and Nprl2-202 (retained intron); Zmynd10-201 (protein coding), Zmynd10-205 (processed transcript), Zmynd10-203 (nonsense mediated decay), Zmynd10-204 and Zmynd10-202 (retained intron); Rassf1-210 (processed transcript), Rassf1-201, Rassf1-202 and Rassf1-204 (protein coding), Rassf1-209 (nonsense mediated decay), Rassf1-205 (retained intron).
Reverse strand: Cyb561d2-201, Cyb561d2-205 and Cyb561d2-202 (protein coding), Cyb561d2-203 (processed transcript), Cyb561d2-204 (nonsense mediated decay); Gm34106-201 (antisense).
Regulatory Build legend: CTCF, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000010201

[Figure: Nprl2-201 (protein coding), 3.48 kb, forward strand. Protein ENSMUSP00000010... with Pfam and PANTHER domain annotation "Nitrogen permease regulator 2". Sequence variants (dbSNP and all other sources) are shown as missense, splice region, and synonymous variants. Scale bar: 0 to 380.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
