https://www.alphaknockout.com

Mouse Nt5m Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Nt5m conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nt5m gene (NCBI Reference Sequence: NM_134029; Ensembl: ENSMUSG00000032615) is located on Mouse chromosome 11. 5 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000102695). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Nt5m gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-330D24 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 36.97% of the coding region. The knockout of Exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 4418 bp, and the size of intron 2 for 3'-loxP site insertion: 7858 bp. The size of the effective cKO region: ~601 bp. The cKO region does not contain any other known gene.
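The frameshift reasoning in the note can be illustrated with a small sketch. It rests on assumptions that are not stated in this report: the coding region is taken as 660 bp (220 aa x 3, stop codon excluded), and the exon lengths passed to `causes_frameshift` are hypothetical, because the coding length of Exon 2 is not given here.

```python
# Minimal sketch of the frameshift logic; all names are illustrative.

CDS_LENGTH_BP = 220 * 3          # assumption: 220 aa protein, stop codon excluded
EXON2_START_FRACTION = 0.3697    # "about 36.97% of the coding region" (from the note)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    return deleted_coding_bp % 3 != 0


def intact_codons_before(start_fraction: float, cds_length_bp: int) -> int:
    """Approximate number of complete codons upstream of the deleted exon."""
    upstream_bp = round(start_fraction * cds_length_bp)
    return upstream_bp // 3


intact_codons = intact_codons_before(EXON2_START_FRACTION, CDS_LENGTH_BP)

# Hypothetical exon lengths, used only to show the modulo-3 rule.
frameshift_for_100_bp = causes_frameshift(100)   # not a multiple of 3
frameshift_for_99_bp = causes_frameshift(99)     # a multiple of 3
```

Under these assumptions, roughly 81 codons would remain in frame upstream of Exon 2; whether the deletion itself shifts the frame depends only on the coding length of Exon 2 modulo 3.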


Overview of the Targeting Strategy

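[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons labeled 1, 2 and 5, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Nt5m, homology arm, cKO region, loxP site.]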


Overview of the Dot Plot

[Figure: dot plot of Sequence 12 aligned against itself (forward and reverse complement). Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.

Overview of the GC Content Distribution

[Figure: GC content distribution of Sequence 12. Window size: 300 bp.]

Summary: Full Length(7101bp) | A(23.59% 1675) | C(22.81% 1620) | T(30.28% 2150) | G(23.32% 1656)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
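As a worked check of the summary line, the overall GC content can be derived from the reported base counts. This is a minimal sketch; the variable and function names are illustrative and not part of the original analysis, which also uses a 300 bp sliding window that is not reproduced here.

```python
# Base counts taken from the summary line of this report.
base_counts = {"A": 1675, "C": 1620, "T": 2150, "G": 1656}


def gc_percent(counts: dict) -> float:
    """Return the GC content as a percentage of all counted bases."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no bases counted")
    return 100.0 * (counts["G"] + counts["C"]) / total


total_length = sum(base_counts.values())       # should equal the full length, 7101 bp
overall_gc = round(gc_percent(base_counts), 2)

print(f"length = {total_length} bp, GC = {overall_gc}%")
```

The counts sum to the stated full length of 7101 bp and give an overall GC content of about 46.13%, consistent with the note that no significant high GC-content region is found.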


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  +       59849542   59852541   3000
YourSeq  88     853    1047  3000   88.1%     chr10  +       82300828   82301141   314
YourSeq  75     957    1047  3000   92.4%     chr11  +       83911501   83911609   109
YourSeq  74     961    1050  3000   92.1%     chr19  +       46112614   46112720   107
YourSeq  73     959    1051  3000   91.2%     chr2   -       157384232  157384336  105
YourSeq  73     957    1051  3000   89.4%     chr19  +       37061766   37061878   113
YourSeq  73     865    1050  3000   85.4%     chr19  +       34978557   34978739   183
YourSeq  67     970    1049  3000   92.5%     chr1   +       153313636  153313716  81
YourSeq  66     964    1047  3000   90.5%     chrX   +       151826163  151826260  98
YourSeq  66     965    1051  3000   91.3%     chr12  +       51950436   51950523   88
YourSeq  65     960    1050  3000   90.4%     chr4   -       40162636   40162745   110
YourSeq  65     964    1050  3000   88.6%     chr5   +       121539600  121539704  105
YourSeq  64     961    1050  3000   88.3%     chr11  +       3474152    3474259    108
YourSeq  63     960    1051  3000   91.0%     chr1   -       176477062  176477171  110
YourSeq  63     973    1050  3000   91.0%     chr6   +       107804954  107805032  79
YourSeq  63     970    1052  3000   91.1%     chr1   +       181671968  181672052  85
YourSeq  62     970    1046  3000   91.0%     chr10  -       37196524   37196601   78
YourSeq  62     959    1049  3000   90.8%     chr14  +       118329980  118330088  109
YourSeq  61     959    1029  3000   94.4%     chr9   +       70057119   70057207   89
YourSeq  60     961    1043  3000   94.2%     chr5   -       143162126  143162226  101

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  +       59853143   59856142   3000
YourSeq  129    2649   2794  3000   94.6%     chr4   -       15458245   15458394   150
YourSeq  108    2660   2782  3000   94.4%     chr11  -       65541218   65541375   158
YourSeq  99     1078   1237  3000   86.3%     chr5   -       5467053    5467213    161
YourSeq  95     1073   1203  3000   86.3%     chrX   -       162422934  162423064  131
YourSeq  92     968    1127  3000   89.7%     chr4   -       147924390  147924795  406
YourSeq  64     968    1031  3000   100.0%    chr7   -       56212517   56212580   64
YourSeq  63     979    1108  3000   98.5%     chr7   -       36999127   36999484   358
YourSeq  63     1095   1212  3000   88.0%     chr1   -       135696830  135696945  116
YourSeq  62     968    1029  3000   100.0%    chr9   -       48217621   48217682   62
YourSeq  62     968    1029  3000   100.0%    chr9   -       44350835   44350896   62
YourSeq  62     968    1029  3000   100.0%    chr8   -       71960800   71960861   62
YourSeq  62     968    1029  3000   100.0%    chr8   -       64273858   64273919   62
YourSeq  62     968    1029  3000   100.0%    chr8   -       40363828   40363889   62
YourSeq  62     968    1029  3000   100.0%    chr6   -       80549120   80549181   62
YourSeq  62     968    1029  3000   100.0%    chr6   -       54881960   54882021   62
YourSeq  62     968    1029  3000   100.0%    chr6   -       32110369   32110430   62
YourSeq  62     968    1029  3000   100.0%    chr3   -       58448713   58448774   62
YourSeq  62     968    1029  3000   100.0%    chr3   -       48998482   48998543   62
YourSeq  62     968    1029  3000   100.0%    chr2   -       83262455   83262516   62

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
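The top hit in each BLAT search is the query matching its own locus on chr11, so the two hits bracket the region between the flanks. A short sketch, assuming the genomic coordinates are 1-based and inclusive (consistent with the SPAN column reading 3000 for both self-hits), shows that the gap between them matches the stated cKO region size. The constant and function names are illustrative.

```python
# Genomic coordinates (chr11, + strand) of the two 100%-identity self-hits.
UPSTREAM_FLANK = (59849542, 59852541)     # 3000 bp upstream section
DOWNSTREAM_FLANK = (59853143, 59856142)   # 3000 bp downstream section


def span(interval: tuple) -> int:
    """Length of a 1-based inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(upstream: tuple, downstream: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    gap = downstream[0] - upstream[1] - 1
    if gap < 0:
        raise ValueError("intervals overlap or are in the wrong order")
    return gap


upstream_span = span(UPSTREAM_FLANK)
downstream_span = span(DOWNSTREAM_FLANK)
bracketed_bp = gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)

print(upstream_span, downstream_span, bracketed_bp)
```

The bracketed interval, chr11:59852542-59853142, is 601 bp long, in agreement with the effective cKO region size of ~601 bp given in the strategy notes.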


Gene and protein information: Nt5m 5',3'-nucleotidase, mitochondrial [ Mus musculus (house mouse) ] Gene ID: 103850, updated on 12-Aug-2019

Gene summary

Official Symbol: Nt5m (provided by MGI)
Official Full Name: 5',3'-nucleotidase, mitochondrial (provided by MGI)
Primary source: MGI:MGI:1917127
See related: Ensembl:ENSMUSG00000032615
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: dNT-2; AI846937; 2010013E09Rik
Expression: Ubiquitous expression in testis adult (RPKM 20.3), adrenal adult (RPKM 18.6) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 11; 11 B1.3

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (59847947..59876533)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (59661575..59690035)

[Figure: genomic context of Nt5m on Chromosome 11 - NC_000077.6]


Transcript information: This gene has 4 transcripts

Gene: Nt5m ENSMUSG00000032615

Description: 5',3'-nucleotidase, mitochondrial [Source:MGI Symbol;Acc:MGI:1917127]
Gene Synonyms: 2010013E09Rik, dNT-2
Location: Chromosome 11: 59,839,447-59,880,968 forward strand. GRCm38:CM001004.2
About this gene: This gene has 4 transcripts (splice variants), 182 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Nt5m-201  ENSMUST00000102695.3  1306  220aa       ENSMUSP00000099756.3  Protein coding  CCDS24779  Q8VCE6   TSL:1 GENCODE basic APPRIS P1
Nt5m-204  ENSMUST00000154699.7  2220  No protein  -                     lncRNA          -          -        TSL:1
Nt5m-202  ENSMUST00000137695.1  1387  No protein  -                     lncRNA          -          -        TSL:2
Nt5m-203  ENSMUST00000149076.7  1316  No protein  -                     lncRNA          -          -        TSL:1

[Figure: Ensembl genomic region view, 61.52 kb. Forward strand genes (Comprehensive set): Nt5m-204 (lncRNA), Nt5m-203 (lncRNA), Nt5m-201 (protein coding), Nt5m-202 (lncRNA). Contigs: AC068808.20. Reverse strand genes (Comprehensive set): Cops3-201 (protein coding), Cops3-208 (protein coding), Cops3-202 (lncRNA), Cops3-206 (lncRNA), Cops3-205 (lncRNA), Cops3-204 (lncRNA), Cops3-207 (protein coding). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000102695

[Figure: transcript Nt5m-201 (protein coding), 28.46 kb, forward strand, with protein annotation tracks for ENSMUSP00000099...: Low complexity (Seg); Superfamily: HAD-like superfamily; SFLD: SFLDG01126, SFLDG01145; Pfam: 5'(3')-deoxyribonucleotidase; PANTHER: PTHR16504:SF6, PTHR16504; Gene3D: 1.10.40.40, HAD superfamily; CDD: cd02587. Variant tracks: All sequence SNPs/i..., Sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 220.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
