https://www.alphaknockout.com Mouse Nudt21 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Nudt21 conditional knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nudt21 gene (NCBI Reference Sequence: NM_026623; Ensembl: ENSMUSG00000031754) is located on mouse chromosome 8. Seven exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000034204). Exons 2-3 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Nudt21 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-65F15 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 17.18% of the coding region. Knockout of exons 2-3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 4274 bp, and the size of intron 3 for 3'-loxP site insertion is 2219 bp. The size of the effective cKO region is ~1859 bp. The cKO region does not contain any other known gene.
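The 17.18% figure can be cross-checked against the 227 aa protein length that this report lists for transcript ENSMUST00000034204. The sketch below is only an illustration: it assumes the percentage is taken over the 681 nt that encode the 227 residues (stop codon excluded), which is an inference and not something the report states. The helper names are hypothetical.

```python
# Cross-check of "Exon 2 starts at about 17.18% of the coding region".
# ASSUMPTION: the percentage is measured over the 227 codons (681 nt) of the
# ENSMUST00000034204 protein, with the stop codon excluded.
PROTEIN_AA = 227
CDS_NT = PROTEIN_AA * 3  # 681 nt


def coding_nt_before(fraction: float, cds_nt: int = CDS_NT) -> int:
    """Nearest whole number of coding nucleotides lying upstream of a fraction."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


upstream_nt = coding_nt_before(0.1718)
print(upstream_nt, f"{upstream_nt / CDS_NT:.2%}")  # 117 17.18%
```

Under that assumption, about 117 coding nucleotides (39 codons) lie in exon 1. The report does not give the coding lengths of exons 2 and 3, so `causes_frameshift` is left for the reader to apply once those lengths are known.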


Overview of the Targeting Strategy

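[Figure: schematic of the targeting strategy, drawn 5' to 3'. Rows: wildtype allele (exons labelled 1, 2, 3, 4 and 7, with two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Nudt21, homology arm, cKO region, loxP site.]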
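The relationship between the four alleles in the overview can be sketched with a toy model. This is an illustration only: the segment names are labels, not sequence data, and the code assumes both loxP sites sit in the same orientation, in which case Cre excises the intervening segment and leaves a single loxP site behind.

```python
# Toy model of the alleles in the targeting overview.
# ASSUMPTION: both loxP sites share the same orientation, so Cre excises
# the floxed segment. Segment names are illustrative labels only.
LOXP = "loxP"

wildtype = ["exon1", "exon2", "exon3", "exon4", "exon5", "exon6", "exon7"]


def insert_loxp(allele, first, last):
    """Flank the segments first..last (inclusive) with two loxP sites."""
    i, j = allele.index(first), allele.index(last)
    return allele[:i] + [LOXP] + allele[i:j + 1] + [LOXP] + allele[j + 1:]


def cre_recombine(allele):
    """Remove everything between the first two loxP sites, keeping one loxP."""
    i = allele.index(LOXP)
    j = allele.index(LOXP, i + 1)
    return allele[:i + 1] + allele[j + 1:]


targeted = insert_loxp(wildtype, "exon2", "exon3")
knockout = cre_recombine(targeted)
print(targeted)   # exon1, loxP, exon2, exon3, loxP, exon4 ... exon7
print(knockout)   # exon1, loxP, exon4 ... exon7
```

Introns and homology arms are left out of the model on purpose; it only shows which exons survive each step.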


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself. Legend: Forward, Reverse Complement.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
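A self-comparison of this kind can be sketched as an exact window match. The code below is a minimal illustration and not the tool used for this report: it assumes exact 10-mer matching on an uppercase A/C/G/T string, and the function names and toy sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of dot coordinates (i, j).

    forward: seq[i:i+window] == seq[j:j+window] with i < j
    reverse: seq[i:i+window] == reverse complement of seq[j:j+window]
    """
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)
    forward = [(i, j) for hits in index.values() for i in hits for j in hits if i < j]
    reverse = []
    for j in range(n_windows):
        for i in index.get(revcomp(seq[j:j + window]), []):
            reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy sequence with one 10 bp unit repeated in tandem (hypothetical data).
unit = "ACGGTCATTG"
toy = "TTTT" + unit * 2 + "CCCC"
forward, reverse = self_dot_plot(toy, window=10)
print(forward)
```

A tandem repeat shows up as forward dots offset from the main diagonal by the repeat period; here the two copies of `unit` start at positions 4 and 14.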

Overview of the GC Content Distribution

Window size: 300 bp

Summary: Full Length(8359bp) | A(28.62% 2392) | C(18.85% 1576) | T(29.57% 2472) | G(22.96% 1919)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
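The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed the same way. The sketch below is illustrative: the counts are taken from the summary line, while `windowed_gc`, its `step` parameter, and the choice of a sliding window are assumptions about how such a profile might be produced, not a description of the tool used here.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 2392, "C": 1576, "T": 2472, "G": 1919}
total = sum(counts.values())                       # 8359
gc_fraction = (counts["G"] + counts["C"]) / total
print(total, f"{gc_fraction:.2%}")                 # 8359 41.81%


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each window of length `window`, advancing by `step`.

    ASSUMPTION: a simple sliding window; the step size used for the plot
    in this report is not stated.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((chunk.count("G") + chunk.count("C")) / window)
    return profile
```

For example, `windowed_gc("GC" * 150 + "AT" * 150, window=300, step=300)` returns `[1.0, 0.0]`.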


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   -       94032753   94035752   3000
YourSeq  218    1653   2555  3000   82.7%     chr8   +       106474267  106475143  877
YourSeq  188    1796   2151  3000   85.7%     chr14  -       67089046   67089452   407
YourSeq  182    1767   2151  3000   87.2%     chr14  -       34694115   34694545   431
YourSeq  179    1646   2150  3000   82.2%     chr3   +       26460001   26460685   685
YourSeq  170    1664   2104  3000   85.8%     chr14  +       61510289   61510795   507
YourSeq  169    1796   2159  3000   82.6%     chr7   -       30682673   30683061   389
YourSeq  168    1696   2151  3000   83.1%     chr1   -       45844914   45845441   528
YourSeq  164    1742   2150  3000   82.2%     chr6   -       119829818  119830317  500
YourSeq  163    1796   2157  3000   83.0%     chr1   -       93660141   93660589   449
YourSeq  163    1803   2503  3000   79.1%     chr9   +       110020545  110021190  646
YourSeq  163    1767   2151  3000   87.6%     chr9   +       66323250   66323701   452
YourSeq  152    1796   2151  3000   87.6%     chr2   -       68212363   68212749   387
YourSeq  150    1878   2156  3000   82.7%     chr7   +       100381122  100792414  411293
YourSeq  149    1696   2146  3000   84.4%     chr17  -       88290688   88291101   414
YourSeq  141    1796   2148  3000   81.6%     chr5   -       23293426   23293813   388
YourSeq  137    1878   2151  3000   84.5%     chr6   -       59840828   59847173   6346
YourSeq  136    1792   2064  3000   84.0%     chr1   -       35337636   35338014   379
YourSeq  136    1767   2132  3000   85.4%     chr5   +       32202973   32203390   418
YourSeq  134    1796   2147  3000   86.5%     chr5   -       60177060   60177462   403

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   -       94027894   94030893   3000
YourSeq  127    2279   2731  3000   84.7%     chr6   -       52063051   52063529   479
YourSeq  111    2289   2707  3000   74.1%     chr8   -       69617021   69617196   176
YourSeq  111    2292   2705  3000   84.0%     chr11  +       77668119   77774007   105889
YourSeq  110    2281   2709  3000   72.9%     chr11  -       80068831   80069019   189
YourSeq  108    2286   2706  3000   74.3%     chr1   -       180414721  180414907  187
YourSeq  107    2272   2710  3000   75.0%     chr16  +       31699807   31700000   194
YourSeq  106    2280   2708  3000   90.9%     chr7   -       79562124   79703714   141591
YourSeq  105    2294   2707  3000   73.7%     chr2   -       158404014  158404181  168
YourSeq  105    2399   2706  3000   87.9%     chr9   +       21454010   21454317   308
YourSeq  103    2393   2649  3000   92.7%     chr1   +       156463136  156749303  286168
YourSeq  101    2313   2708  3000   76.2%     chr9   +       71705602   71705764   163
YourSeq  101    2496   2687  3000   85.0%     chr14  +       7828278    7828656    379
YourSeq  100    2504   2710  3000   91.6%     chr8   +       111311117  111311530  414
YourSeq  99     2292   2707  3000   77.9%     chrX   -       102491439  102491618  180
YourSeq  97     2280   2709  3000   75.4%     chr3   -       108342538  108342738  201
YourSeq  97     2540   2709  3000   82.1%     chr1   +       86050528   86050689   162
YourSeq  96     2286   2674  3000   83.0%     chr4   +       148539269  148539638  370
YourSeq  95     2272   2707  3000   74.1%     chr19  -       37742786   37742989   204
YourSeq  93     2351   2708  3000   74.4%     chr11  +       103978932  103979061  130

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
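The top hit in each BLAT table gives the chr8 coordinates of the 3000 bp query itself, so the interval between the two queries can be compared with the stated size of the effective cKO region. The sketch below is a simple arithmetic check; it assumes the two queries directly abut the cKO region, which the report implies but does not state, and the variable names are hypothetical.

```python
# chr8 coordinates of the 100% identity hits in the two BLAT tables
# (1-based, inclusive; the gene is on the reverse strand, so the
# "upstream" query has the higher coordinates).
UPSTREAM = (94032753, 94035752)
DOWNSTREAM = (94027894, 94030893)


def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# ASSUMPTION: the region between the two queries is the effective cKO region.
between = (DOWNSTREAM[1] + 1, UPSTREAM[0] - 1)
print(span(*UPSTREAM), span(*DOWNSTREAM), between, span(*between))
# 3000 3000 (94030894, 94032752) 1859
```

The 1859 bp interval agrees with the "~1859 bp" effective cKO region quoted in the strategy note.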

Gene and protein information:
Nudt21 nudix (nucleoside diphosphate linked moiety X)-type motif 21 [ Mus musculus (house mouse) ]
Gene ID: 68219, updated on 10-Oct-2019

Gene summary

Official Symbol: Nudt21 (provided by MGI)
Official Full Name: nudix (nucleoside diphosphate linked moiety X)-type motif 21 (provided by MGI)
Primary source: MGI:MGI:1915469
See related: Ensembl:ENSMUSG00000031754
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 25kDa; Cpsf5; AU014860; AW549947; 3110048P04Rik; 5730530J16Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 99.8), ovary adult (RPKM 79.7) and 28 other tissues

Genomic context

Location: 8; 8 C5

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (94019403..94037039, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (96543303..96560939, complement)
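The two annotation rows can be compared with a little coordinate arithmetic. The sketch below only checks the endpoints listed in the table; the constant offset it finds applies to these two endpoints and should not be read as a general conversion rule between the assemblies.

```python
# Gene locations copied from the table (1-based, inclusive, complement strand).
GRCM38 = (94019403, 94037039)    # NC_000074.6, annotation release 108
MGSCV37 = (96543303, 96560939)   # NC_000074.5, Build 37.2


def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


offsets = tuple(old - new for old, new in zip(MGSCV37, GRCM38))
print(span(*GRCM38), span(*MGSCV37), offsets)
# 17637 17637 (2523900, 2523900)
```

Both assemblies place the gene in a 17,637 bp interval, shifted by the same amount at each end.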

Chromosome 8 - NC_000074.6


Transcript information: This gene has 4 transcripts

Gene: Nudt21 ENSMUSG00000031754

Description: nudix (nucleoside diphosphate linked moiety X)-type motif 21 [Source:MGI Symbol;Acc:MGI:1915469]
Gene Synonyms: 25kDa, 3110048P04Rik, 5730530J16Rik, Cpsf5
Location: Chromosome 8: 94,015,496-94,037,031 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 4 transcripts (splice variants), 218 orthologues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name        Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt     Flags
Nudt21-201  ENSMUST00000034204.10  5010  227aa    ENSMUSP00000034204.9  Protein coding  CCDS40433  Q9CQF3      TSL:1, GENCODE basic, APPRIS P1
Nudt21-204  ENSMUST00000212981.1   1184  218aa    ENSMUSP00000148597.1  Protein coding  -          A0A1D5RM23  TSL:3, GENCODE basic
Nudt21-202  ENSMUST00000212622.1   664   169aa    ENSMUSP00000148485.1  Protein coding  -          A0A1D5RLS2  CDS 5' incomplete, TSL:3
Nudt21-203  ENSMUST00000212911.1   470   129aa    ENSMUSP00000148500.1  Protein coding  -          A0A1D5RLT7  CDS 5' incomplete, TSL:3


[Figure: genomic region view, 41.54 kb, chromosome 8 from 94.01 Mb to 94.04 Mb. Forward strand: Ogfod1-201 (protein coding), Ogfod1-202 (protein coding), Ogfod1-203 (protein coding), Ogfod1-204 (lncRNA), Ogfod1-205 (retained intron), Ogfod1-207 (lncRNA). Contig: AC138118.4. Reverse strand: Amfr-201, Amfr-204, Nudt21-201, Nudt21-202, Nudt21-203, Nudt21-204 (all protein coding). Regulatory Build legend: CTCF, enhancer, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000034204

[Figure: protein view of Nudt21-201 (protein coding, reverse strand, 21.54 kb), ENSMUSP00000034... Annotation tracks: Superfamily (NUDIX hydrolase-like domain superfamily); Pfam (Cleavage/polyadenylation specificity factor subunit 5); PROSITE profiles (NUDIX hydrolase domain); PIRSF (Cleavage/polyadenylation specificity factor subunit 5); PANTHER (PTHR13047:SF0; Cleavage/polyadenylation specificity factor subunit 5); Gene3D (3.90.79.10). Sequence variants (dbSNP and all other sources), with legend: missense variant, synonymous variant. Scale bar: residues 0 to 227.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
