http://www.alphaknockout.com/

Mouse Ythdf2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Ythdf2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ythdf2 gene (NCBI Reference Sequence: NM_145393; Ensembl: ENSMUSG00000040025) is located on mouse chromosome 4. Five exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 5 (Transcript: ENSMUST00000152796). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Ythdf2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-127O5 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit female, but not male, infertility and preweaning lethality that is background sensitive.

Knocking out exon 3 will cause a frameshift in the gene; exon 3 covers 4.61% of the coding region. The size of intron 2 for 5'-loxP site insertion is 600 bp, and the size of intron 3 for 3'-loxP site insertion is 5036 bp. The size of the effective cKO region is ~579 bp. The cKO region does not contain any other known gene.
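The frameshift statement can be sanity-checked from figures quoted in this report. The sketch below is an estimate only: it assumes the coding region is 579 codons (the protein length listed for transcript ENSMUST00000152796 in the transcript table) counted without the stop codon, and it back-calculates the coding length of exon 3 from the 4.61% figure. The exact exon length is not stated in this report, and the function names are illustrative.

```python
# Figures taken from this report.
PROTEIN_LENGTH_AA = 579      # protein length of Ythdf2-203 (ENSMUST00000152796)
FRACTION_OF_CDS = 0.0461     # exon 3 "covers 4.61% of the coding region"


def estimate_deleted_coding_bp(protein_aa: int, fraction: float) -> int:
    """Estimate the deleted coding length in bp.

    Assumption: the coding region is protein_aa * 3 bp (stop codon excluded).
    """
    cds_bp = protein_aa * 3
    return round(cds_bp * fraction)


def causes_frameshift(deleted_bp: int) -> bool:
    """A deletion shifts the reading frame when its length is not a multiple of 3."""
    return deleted_bp % 3 != 0


deleted_bp = estimate_deleted_coding_bp(PROTEIN_LENGTH_AA, FRACTION_OF_CDS)
print(f"Estimated coding bases removed: {deleted_bp}")
print(f"Frameshift expected: {causes_frameshift(deleted_bp)}")
```

Under these assumptions the estimated deletion is not a multiple of three, which agrees with the frameshift stated above.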


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Ythdf2; cKO region; loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself (axes: Sequence 12), showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
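A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration, not the tool used for this report: it marks exact forward-strand matches between 10 bp windows only (no reverse complement, no mismatch tolerance), and the toy sequence and function name are made up for the example.

```python
def self_dot_plot(seq: str, window: int = 10) -> set[tuple[int, int]]:
    """Return (i, j) pairs with i < j whose window-length words are identical.

    Each pair corresponds to one off-diagonal dot in a self dot plot.
    """
    # Index every word of length `window` by its start positions.
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    # Any word that occurs more than once yields dots for each pair of starts.
    dots: set[tuple[int, int]] = set()
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.add((starts[a], starts[b]))
    return dots


# Toy sequence (hypothetical): unique flanks around six tandem copies of "CAG".
toy = "ACGTTGCA" + "CAG" * 6 + "TTGACCGA"
dots = self_dot_plot(toy, window=10)
offsets = sorted({j - i for i, j in dots})
print(f"{len(dots)} off-diagonal dots, at offsets {offsets}")
```

Tandem repeats show up as dots at offsets that are multiples of the repeat unit length, which appear as short diagonals parallel to the main diagonal.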

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis: Sequence 12).]

Summary: Full Length(7080bp) | A(23.01% 1629) | C(22.44% 1589) | G(23.97% 1697) | T(30.58% 2165)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
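The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed with a simple sliding window. The sketch below assumes a plain A/C/G/T string as input; the function names and the step parameter are illustrative and not taken from the analysis tool used for this report.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of every full window of length `window`, moving by `step`."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts from the summary line of this report.
counts = {"A": 1629, "C": 1589, "G": 1697, "T": 2165}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(f"Full length: {total} bp, overall GC: {overall_gc:.2%}")

# Small demonstration of the windowed profile on a toy sequence.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

The overall figure is an average, so a moderate overall value does not rule out individual 300 bp windows with high GC content.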


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   -       132211082  132214081  3000
YourSeq  207    502    852   3000   91.9%     chr15  +       19855351   19855846   496
YourSeq  202    1823   2182  3000   80.9%     chrX   -       70422428   70422786   359
YourSeq  202    502    802   3000   93.2%     chr2   +       31737037   32144506   407470
YourSeq  202    502    833   3000   91.8%     chr2   +       20511102   20511542   441
YourSeq  193    502    817   3000   89.5%     chr11  -       77620884   77621701   818
YourSeq  193    1823   2182  3000   79.9%     chrX   +       70517445   70517803   359
YourSeq  191    502    774   3000   94.0%     chr14  -       30787914   30791713   3800
YourSeq  191    525    819   3000   90.4%     chr17  +       25916041   25916552   512
YourSeq  184    503    814   3000   93.8%     chr16  -       21023223   21131866   108644
YourSeq  183    531    788   3000   91.9%     chr13  -       38659308   38659786   479
YourSeq  165    525    786   3000   95.1%     chr17  -       74504139   74510485   6347
YourSeq  165    505    797   3000   87.7%     chr1   -       155407618  155408279  662
YourSeq  165    502    798   3000   92.8%     chr12  +       70606907   70607468   562
YourSeq  159    527    813   3000   88.5%     chr19  +       21113014   21113433   420
YourSeq  155    502    687   3000   92.0%     chr7   +       55959290   55959471   182
YourSeq  153    512    745   3000   93.4%     chr14  -       52335326   52336220   895
YourSeq  151    505    678   3000   95.3%     chrX   +       16534794   16534976   183
YourSeq  150    502    665   3000   95.8%     chr11  +       61221328   61221491   164
YourSeq  147    502    662   3000   96.3%     chr2   +       29128433   29128596   164

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   -       132207502  132210501  3000
YourSeq  208    1263   2770  3000   95.7%     chr11  +       74846474   75005611   159138
YourSeq  178    1348   2761  3000   91.6%     chr16  +       32441895   32480734   38840
YourSeq  157    2590   2775  3000   94.9%     chr10  +       42519757   42519951   195
YourSeq  153    2590   2770  3000   94.3%     chr11  +       60893634   60893822   189
YourSeq  150    2589   2751  3000   97.0%     chr6   +       24359299   24359468   170
YourSeq  148    2591   2770  3000   91.2%     chr17  -       34207246   34207422   177
YourSeq  147    2592   2751  3000   97.5%     chrX   -       157247681  157247856  176
YourSeq  147    2587   2749  3000   96.9%     chr13  +       117349295  117349467  173
YourSeq  147    2588   2749  3000   96.3%     chr12  +       108736514  108737104  591
YourSeq  146    2589   2764  3000   94.6%     chr12  +       94365627   94365806   180
YourSeq  145    2592   2768  3000   91.9%     chr10  -       37019022   37019198   177
YourSeq  145    2592   2749  3000   96.8%     chr12  +       24813454   24813612   159
YourSeq  144    2598   2769  3000   92.9%     chr17  -       29932610   29932782   173
YourSeq  143    2599   2760  3000   95.0%     chr19  -       16153401   16153726   326
YourSeq  143    2591   2759  3000   94.6%     chr12  -       60179504   60179678   175
YourSeq  142    2590   2751  3000   93.2%     chr11  +       20612260   20612420   161
YourSeq  140    2574   2747  3000   94.3%     chr13  -       20170892   20171104   213
YourSeq  140    2589   2751  3000   95.5%     chr1   +       73304664   73305271   608
YourSeq  139    2597   2750  3000   94.8%     chr17  -       35208865   35209017   153

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
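The chr4 coordinates of the two 100% identity hits can be used to check how much sequence lies between the upstream and downstream query sections. The sketch below treats the reported START and END values as 1-based inclusive coordinates, which is an assumption (it is consistent with each hit spanning exactly 3000 bp); the helper names are illustrative.

```python
# chr4 coordinates of the top (100% identity) hit in each BLAT table above.
UP_START, UP_END = 132211082, 132214081      # section upstream of exon 3
DOWN_START, DOWN_END = 132207502, 132210501  # section downstream of exon 3


def span(start: int, end: int) -> int:
    """Length of an interval given 1-based inclusive coordinates (assumed)."""
    return end - start + 1


def gap_between(lower_end: int, upper_start: int) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper_start - lower_end - 1


# The gene is on the reverse strand, so the upstream section has the
# higher genomic coordinates and the downstream section the lower ones.
gap = gap_between(DOWN_END, UP_START)
print(f"Upstream span:   {span(UP_START, UP_END)} bp")
print(f"Downstream span: {span(DOWN_START, DOWN_END)} bp")
print(f"Bases between the two sections: {gap} bp")
```

The interval between the two sections comes out close to the ~579 bp effective cKO region size quoted in the strategy summary. Whether the two figures describe exactly the same interval is an assumption, and a one-base difference can arise from coordinate conventions.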

Gene and information:

Ythdf2 YTH N6-methyladenosine RNA binding protein 2 [ Mus musculus (house mouse) ]

Gene ID: 213541, updated on 26-Jun-2020

Gene summary

Official Symbol: Ythdf2 (provided by MGI)
Official Full Name: YTH N6-methyladenosine RNA binding protein 2 (provided by MGI)
Primary source: MGI:MGI:2444233
See related: Ensembl:ENSMUSG00000040025
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: HGRG8; NY-REN-2; 9430020E02Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 11.2), whole brain E14.5 (RPKM 10.9) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 D2.3

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108.20200622        current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (132184916..132212256, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (131740831..131768171, complement)

[Figure: genomic context of Ythdf2 on Chromosome 4 - NC_000070.6]


Transcript information: This gene has 4 transcripts

Gene: Ythdf2 ENSMUSG00000040025

Description: YTH N6-methyladenosine RNA binding protein 2 [Source:MGI Symbol;Acc:MGI:2444233]
Gene Synonyms: 9430020E02Rik, HGRG8, NY-REN-2
Location: Chromosome 4: 132,184,912-132,212,303 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 4 transcripts (splice variants), 273 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Ythdf2-203  ENSMUST00000152796.7  4129  579aa       ENSMUSP00000120414.1  Protein coding   CCDS18719  Q91YT7   TSL:1 GENCODE basic APPRIS P1
Ythdf2-201  ENSMUST00000085181.4  955   51aa        ENSMUSP00000082275.4  Protein coding   -          E9PW90   TSL:2 GENCODE basic
Ythdf2-204  ENSMUST00000165072.1  520   100aa       ENSMUSP00000129225.1  Protein coding   -          F6Y6H7   CDS 5' incomplete TSL:3
Ythdf2-202  ENSMUST00000102570.4  2536  No protein  -                     Retained intron  -          -        TSL:5

[Figure: Ensembl genomic view of the Ythdf2 locus, 47.39 kb, from 132.18 Mb to 132.22 Mb. Tracks: contig BX537301.12 (forward strand); reverse-strand transcripts Ythdf2-203 (protein coding), Ythdf2-201 (protein coding), Ythdf2-204 (protein coding), Ythdf2-202 (retained intron), Rps15a-ps4-202 (processed transcript), Rps15a-ps4-201 (transcribed processed pseudogene), Gmeb1-207 (protein coding) and Gm22033-201 (miRNA); Regulatory Build. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; RNA gene; processed transcript). Regulation legend: CTCF; open chromatin; promoter; promoter flank.]


Transcript: ENSMUST00000152796

[Figure: protein-level view of Ythdf2-203 (protein coding; reverse strand, 27.39 kb) for ENSMUSP00000120... Tracks: MobiDB lite; low complexity (Seg); Pfam YTH domain; PROSITE profiles YTH domain; PANTHER PTHR12357:SF8 and PTHR12357; Gene3D 3.10.590.10; sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; synonymous variant. Scale bar: 0 to 579.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
