https://www.alphaknockout.com

Mouse Tspyl2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tspyl2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tspyl2 gene (NCBI Reference Sequence: NM_029836; Ensembl: ENSMUSG00000041096) is located on mouse chromosome X. Seven exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000044509). Exons 2~6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tspyl2 gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-147K15 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous or hemizygous for a targeted allele are viable, fertile and do not exhibit increased tumor incidence; however, mutant mouse embryonic fibroblasts are impaired in cell cycle arrest following DNA damage induced by ionizing radiation.

Exon 2 starts from about 38.75% of the coding region. The knockout of exons 2~6 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 1159 bp, and the size of intron 6 for 3'-loxP site insertion: 1159 bp. The size of the effective cKO region: ~2156 bp. The cKO region does not contain any other known gene.
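As a rough illustration of the figures above, the following sketch converts the stated 38.75% position into an approximate coding-sequence coordinate, using the 677 aa protein length reported for ENSMUST00000044509 later in this report. Whether the percentage counts the stop codon is not stated, so that choice is an assumption, and the function names are hypothetical. The frameshift check takes a deleted coding length as input; the example length passed to it is made up, since the report does not list individual exon sizes.

```python
# Figures taken from this report; the stop-codon convention is an assumption.
PROTEIN_LENGTH_AA = 677        # translation length of ENSMUST00000044509
EXON2_START_FRACTION = 0.3875  # "about 38.75% of the coding region"


def coding_position(fraction: float, protein_length_aa: int,
                    include_stop: bool = True) -> int:
    """Approximate 1-based CDS nucleotide at a given fraction of the coding region."""
    cds_length_nt = protein_length_aa * 3 + (3 if include_stop else 0)
    return round(fraction * cds_length_nt)


def is_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


exon2_start_nt = coding_position(EXON2_START_FRACTION, PROTEIN_LENGTH_AA)
codons_before_exon2 = exon2_start_nt // 3

print(f"Exon 2 begins near CDS nucleotide {exon2_start_nt}")
print(f"Roughly {codons_before_exon2} of {PROTEIN_LENGTH_AA} codons lie in exon 1")

# Hypothetical deleted coding length, for illustration only.
print(is_frameshift(1000))
```

Under these assumptions, roughly the first 260 codons precede the cKO region, so any truncated product would retain well under half of the protein.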


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy, drawn 5' to 3'. Panels: wildtype allele (exons 1 to 7, with two gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Tspyl2; cKO region; loxP site.]
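The relationship between the targeted allele and the constitutive KO allele can be sketched as a simple list operation. This is a toy model, assuming two loxP sites in the same orientation placed in intron 1 and intron 6 as described in the strategy summary; the element names and the function `cre_recombine` are hypothetical.

```python
def cre_recombine(allele: list[str]) -> list[str]:
    """Excise everything between two same-orientation loxP sites, leaving one loxP."""
    sites = [i for i, element in enumerate(allele) if element == "loxP"]
    if len(sites) != 2:
        raise ValueError("this toy model expects exactly two loxP sites")
    first, second = sites
    # Keep everything up to and including the first loxP, then resume after the second.
    return allele[: first + 1] + allele[second + 1:]


# Targeted (floxed) allele: loxP sites flank exons 2 to 6.
targeted_allele = ["exon1", "loxP", "exon2", "exon3", "exon4",
                   "exon5", "exon6", "loxP", "exon7"]

ko_allele = cre_recombine(targeted_allele)
print(ko_allele)  # ['exon1', 'loxP', 'exon7']
```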


Overview of the Dot Plot

Window size: 10 bp

[Figure: self dot plot of Sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
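A self dot plot of this kind can be approximated with exact word matching. The sketch below is not the tool used to produce the plot in this report; it assumes exact 10 bp matches (the stated window size) and an uppercase A/C/G/T sequence, and the function names are hypothetical. It returns the coordinates that would be drawn as forward and reverse-complement dots.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) positions with matching words.

    forward: the word at i also occurs at j > i (off-diagonal repeats only).
    reverse: the reverse complement of the word at i occurs at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence with one direct repeat of a 10 bp word, separated by GGGG.
repeat_forward, _ = self_dot_plot("ACGTTGCAAT" + "GGGG" + "ACGTTGCAAT")
print(repeat_forward)
```

Runs of adjacent forward dots parallel to the main diagonal would indicate tandem or direct repeats; an empty forward list corresponds to the "no significant tandem repeat" case.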

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(8656bp) | A(26.54% 2297) | C(24.15% 2090) | T(24.66% 2135) | G(24.65% 2134)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
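The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed the same way. The sketch below uses the reported counts; the sliding-window function is a minimal illustration with the stated 300 bp window, and its name and `step` parameter are assumptions rather than the settings of the tool that produced the plot.

```python
# Base counts from the summary line of this report.
BASE_COUNTS = {"A": 2297, "C": 2090, "T": 2135, "G": 2134}

total_length = sum(BASE_COUNTS.values())
overall_gc_percent = 100 * (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / total_length


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each full window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


print(f"Length: {total_length} bp, overall GC: {overall_gc_percent:.2f}%")
```

An overall GC content near 49% is unremarkable; the difficulty flagged in the note comes from local windows with high GC content, which is what the windowed profile exposes.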


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   -       152340631  152343630  3000
YourSeq  280    1647   1948  3000   96.7%     chr14  -       89371279   89371584   306
YourSeq  44     1776   1892  3000   75.0%     chr1   +       135075629  135075727  99

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
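The BLAT rows can be checked mechanically. The sketch below parses the three upstream hits listed above (query name omitted), confirms that each SPAN equals target END minus START plus one, and reports how much of the 3000 bp query each hit covers. The `BlatHit` type and `parse_hit` function are hypothetical helpers written for this illustration, not part of any BLAT client.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated result row (without the query name)."""
    f = line.split()
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# Upstream BLAT rows as reported above.
ROWS = [
    "3000 1 3000 3000 100.0% chrX - 152340631 152343630 3000",
    "280 1647 1948 3000 96.7% chr14 - 89371279 89371584 306",
    "44 1776 1892 3000 75.0% chr1 + 135075629 135075727 99",
]

hits = [parse_hit(row) for row in ROWS]
for hit in hits:
    coverage = 100 * (hit.q_end - hit.q_start + 1) / hit.q_size
    span_ok = hit.span == hit.t_end - hit.t_start + 1
    print(f"{hit.chrom}{hit.strand} score={hit.score} "
          f"query coverage={coverage:.1f}% span consistent={span_ok}")
```

The first row is the query matching its own locus on chrX; the remaining rows cover only a small part of the query.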

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   -       152335475  152338474  3000
YourSeq  857    1994   3000  3000   93.4%     chr9   +       60701613   60702612   1000
YourSeq  856    2002   3000  3000   93.3%     chr17  -       4682171    4683153    983
YourSeq  849    2019   3000  3000   93.7%     chr5   -       32418939   32419916   978
YourSeq  843    1996   3000  3000   92.7%     chr5   +       127530970  127531958  989
YourSeq  841    1995   3000  3000   93.9%     chr8   -       4112596    4113696    1101
YourSeq  841    2008   3000  3000   92.9%     chr7   -       128827990  128828966  977
YourSeq  840    1984   3000  3000   92.8%     chr5   +       137878621  137879821  1201
YourSeq  839    1996   3000  3000   92.1%     chr8   +       19666538   19667525   988
YourSeq  837    1995   3000  3000   92.2%     chr4   -       32213800   32214781   982
YourSeq  837    2009   2996  3000   92.8%     chr14  -       34113272   34114238   967
YourSeq  836    1995   3000  3000   92.8%     chr6   -       147669183  147670164  982
YourSeq  835    2021   3000  3000   92.9%     chr13  +       41798316   41799275   960
YourSeq  835    2021   3000  3000   92.8%     chr13  +       34262731   34263691   961
YourSeq  833    1998   3000  3000   94.0%     chrX   -       168528824  168751150  222327
YourSeq  833    2000   3000  3000   92.7%     chr16  -       90651331   90652320   990
YourSeq  833    2019   3000  3000   92.7%     chr13  -       115145743  115146707  965
YourSeq  831    1994   3000  3000   92.1%     chr6   -       32501372   32502350   979
YourSeq  831    2022   2999  3000   92.3%     chr18  -       72958500   72959458   959
YourSeq  831    1997   3000  3000   92.2%     chr16  +       73789888   73790863   976

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Tspyl2 TSPY-like 2 [Mus musculus (house mouse)]
Gene ID: 52808, updated on 13-Aug-2019

Gene summary

Official Symbol: Tspyl2 (provided by MGI)
Official Full Name: TSPY-like 2 (provided by MGI)
Primary source: MGI:MGI:106244
See related: Ensembl:ENSMUSG00000041096
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CDA1; CINAP; DENTT; DXBwg1396e; DXHXS1008E; E130307F10Rik
Expression: Ubiquitous expression in cerebellum adult (RPKM 34.8), frontal lobe adult (RPKM 34.3) and 28 other tissues
Orthologs: human; all

Genomic context

Location: X F3; X 68.46 cM

Exon count: 7

Annotation release  Status    Assembly                        Chr  Location
108                 current   GRCm38.p6 (GCF_000001635.26)    X    NC_000086.7 (152336851..152342484, complement)
Build 37.2          previous  MGSCv37 (GCF_000001635.18)      X    NC_000086.6 (148771395..148777027, complement)

Chromosome X - NC_000086.7


Transcript information: This gene has 5 transcripts

Gene: Tspyl2 ENSMUSG00000041096

Description: TSPY-like 2 [Source:MGI Symbol;Acc:MGI:106244]
Gene Synonyms: CINAP, DENTT, DXBwg1396e, DXHXS1008E, E130307F10Rik
Location: Chromosome X: 152,336,852-152,342,425 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 5 transcripts (splice variants), 188 orthologues, 10 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tspyl2-201  ENSMUST00000044509.6  2696  677aa       ENSMUSP00000046782.6  Protein coding           CCDS30474  Q7TQI8   TSL:1, GENCODE basic, APPRIS P1
Tspyl2-205  ENSMUST00000178165.1  841   120aa       ENSMUSP00000137121.1  Nonsense mediated decay  -          J3QP69   CDS 5' incomplete, TSL:5
Tspyl2-203  ENSMUST00000149685.7  2804  No protein  -                     Retained intron          -          -        TSL:5
Tspyl2-202  ENSMUST00000138918.7  2658  No protein  -                     Retained intron          -          -        TSL:5
Tspyl2-204  ENSMUST00000177748.1  391   No protein  -                     Retained intron          -          -        TSL:3

[Figure: Ensembl genome browser view of a 25.57 kb region of chromosome X (152.33 Mb to 152.35 Mb), contig AL731727.18. Reverse-strand features shown: Tspyl2-201 (protein coding), Tspyl2-202, Tspyl2-203 and Tspyl2-204 (retained intron), Tspyl2-205 (nonsense mediated decay); Kantr-207 (protein coding), Kantr-202 to Kantr-206 (lncRNA); Gpr173-201 (protein coding). Regulatory Build legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000044509

[Figure: Ensembl protein summary for Tspyl2-201 (protein coding; reverse strand, 5.57 kb), translation ENSMUSP00000046... Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: NAP-like superfamily; Pfam: Nucleosome assembly protein (NAP); PANTHER: PTHR11875:SF72, Nucleosome assembly protein (NAP); Gene3D: 3.30.1120.90; sequence variants (dbSNP and all other sources). Variant legend: inframe insertion; missense variant; synonymous variant. Scale bar: 0 to 677 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
