https://www.alphaknockout.com

Mouse Cstf2t Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cstf2t conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cstf2t gene (NCBI Reference Sequence: NM_031249; Ensembl: ENSMUSG00000053536) is located on mouse chromosome 19. One exon is identified, with both the ATG start codon and the TGA stop codon in exon 1 (Transcript: ENSMUST00000066039). Exon 1 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cstf2t gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-139N11 as template. Cas9, gRNA, and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Male mice homozygous for a targeted null allele are infertile due to low sperm counts, significant developmental defects in spermiogenesis, and variable abnormalities in epididymal sperm morphology and motility consistent with oligoasthenoteratozoospermia. Homozygous null females are fertile.

Exon 1 covers 100.0% of the coding region; the start codon and the stop codon are both in exon 1. The size of the effective cKO region is ~1929 bp. The cKO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: targeting strategy schematic. Rows: wildtype allele (5' to 3', with two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Exon 1 carries the ATG start codon and the TGA stop codon. Legend: homology arm; exon of mouse Cstf2t; cKO region; loxP site.]
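
The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is only a sketch: it assumes two loxP sites in the same orientation flanking the cKO region, uses the canonical 34 bp loxP sequence, and the function name `cre_excise` and the short flank and exon strings are hypothetical placeholders, not sequences from this project.

```python
# Canonical 34 bp loxP site (13 bp repeat, 8 bp spacer, 13 bp repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Toy model of Cre recombination between two same-orientation loxP sites.

    The segment between the first two loxP sites is removed and a single
    loxP site is left behind. Alleles with fewer than two sites are
    returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything up to and including the first loxP, then resume
    # after the second loxP.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Hypothetical placeholder sequences, for illustration only.
five_prime_flank = "AAAA"
exon = "ATGCCCTGA"
three_prime_flank = "TTTT"

targeted_allele = five_prime_flank + LOXP + exon + LOXP + three_prime_flank
ko_allele = cre_excise(targeted_allele)
```

In this model the floxed exon disappears from `ko_allele` while both flanks and one loxP site remain, which mirrors the bottom row of the schematic.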

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned with itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
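
The self-alignment described in the note can be sketched as an exact-match word comparison. The version below is a minimal illustration, not the tool used for this report: it only considers forward-strand matches, requires the 10 bp windows to be identical, and the function name `self_dot_plot` and the example sequence are hypothetical.

```python
def self_dot_plot(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs with i < j whose window-length words match.

    Each pair is an off-diagonal dot in a forward-strand self dot plot.
    A run of dots parallel to the main diagonal suggests a repeat.
    """
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for starts in positions.values():
        # Every pair of start positions sharing the same word is a dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Hypothetical example: a 10 bp unit repeated in tandem between short flanks.
unit = "GATTACAGGC"
example = "TTT" + unit * 2 + "AAA"
dots = self_dot_plot(example, window=10)
```

An empty result means no window-length word occurs twice, which is the forward-strand analogue of the "no significant tandem repeat" conclusion.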

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7896 bp) | A (28.23%, 2229) | C (19.81%, 1564) | T (28.99%, 2289) | G (22.97%, 1814)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
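
The overall GC content follows directly from the base counts in the summary, and the windowed profile can be sketched in a few lines. The base counts below are taken from the summary; the helper names and the step size are assumptions, since the report states only the 300 bp window.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of each full window along seq (step is an assumption)."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts reported in the summary line.
counts = {"A": 2229, "C": 1564, "T": 2289, "G": 1814}
total = sum(counts.values())
overall_gc_percent = 100 * (counts["G"] + counts["C"]) / total
```

With these counts the total is 7896 bp and the overall GC content is about 42.78%, consistent with the note that no high-GC region stands out.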

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  +       31080067   31083066   3000
YourSeq  317    761    1276  3000   94.9%     chr11  +       77650851   77651381   531
YourSeq  317    761    1274  3000   89.4%     chr11  +       76270771   76271161   391
YourSeq  301    761    1425  3000   89.2%     chr7   -       127518954  127519336  383
YourSeq  290    761    1274  3000   94.5%     chr4   -       119532601  119533214  614
YourSeq  287    761    1275  3000   91.2%     chr11  -       103757055  103757504  450
YourSeq  286    767    1275  3000   88.4%     chr15  -       83249011   83249343   333
YourSeq  284    763    1265  3000   90.4%     chr10  -       61966443   61966767   325
YourSeq  281    766    1262  3000   90.0%     chr5   -       33663109   33663514   406
YourSeq  281    761    1273  3000   88.3%     chr4   +       136060365  136060698  334
YourSeq  271    761    1269  3000   94.5%     chr2   -       119206073  119206651  579
YourSeq  267    761    1239  3000   93.3%     chr12  +       72726838   72727491   654
YourSeq  255    761    1266  3000   94.2%     chr11  -       4864170    4864774    605
YourSeq  254    761    1264  3000   87.3%     chr19  -       4258845    4259177    333
YourSeq  243    761    1250  3000   87.9%     chr16  -       48282335   48282689   355
YourSeq  243    763    1233  3000   93.9%     chr9   +       73080294   73080959   666
YourSeq  220    761    1166  3000   91.4%     chr15  +       99385821   99386213   393
YourSeq  209    762    1061  3000   96.0%     chr8   +       127169851  127170358  508
YourSeq  198    761    1181  3000   93.5%     chr10  -       80085546   80086199   654
YourSeq  191    740    1151  3000   87.4%     chr9   -       109963534  109963766  233

Note: The 3000 bp section upstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.
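
When reading a BLAT table like the one above, it helps to separate the expected self-hit (the query matching its own locus) from every other hit. The sketch below parses a few rows copied from the upstream results; the `Hit` type, the helper names, and the definition of a self-hit as a full-length 100% identity match are assumptions made for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> Hit:
    """Parse one whitespace-separated row (without the QUERY column)."""
    f = line.split()
    return Hit(
        int(f[0]), int(f[1]), int(f[2]), int(f[3]), float(f[4].rstrip("%")),
        f[5], f[6], int(f[7]), int(f[8]), int(f[9]),
    )


def is_self_hit(hit: Hit) -> bool:
    """Assumed definition: the whole query matches at 100% identity."""
    return hit.identity == 100.0 and hit.q_start == 1 and hit.q_end == hit.q_size


# Three rows copied from the upstream BLAT results.
ROWS = """\
3000 1 3000 3000 100.0% chr19 + 31080067 31083066 3000
317 761 1276 3000 94.9% chr11 + 77650851 77651381 531
191 740 1151 3000 87.4% chr9 - 109963534 109963766 233"""

hits = [parse_hit(row) for row in ROWS.splitlines()]
self_hits = [h for h in hits if is_self_hit(h)]
other_hits = [h for h in hits if not is_self_hit(h)]
best_other = max(other_hits, key=lambda h: h.score)
```

The report does not state a numeric cutoff for "significant similarity", so any threshold applied to `other_hits` would be an additional assumption.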

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START     END       SPAN
-------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr19  +       31084963  31087962  3000
YourSeq  27     1820   1863  3000   86.3%     chr2   -       65632663  65632704  42
YourSeq  23     70     103   3000   85.3%     chr5   -       88672771  88672806  36

Note: The 3000 bp section downstream of Exon 1 is BLAT searched against the genome. No significant similarity is found.

Gene and information:

Cstf2t: cleavage stimulation factor, 3' pre-RNA subunit 2, tau [Mus musculus (house mouse)]
Gene ID: 83410, updated on 10-Oct-2019

Gene summary

Official Symbol: Cstf2t (provided by MGI)
Official Full Name: cleavage stimulation factor, 3' pre-RNA subunit 2, tau (provided by MGI)
Primary source: MGI:MGI:1932622
See related: Ensembl:ENSMUSG00000053536
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 64kDa; C77975; tCstF-64; tauCstF-64
Orthologs: human; all

Genomic context

Location: 19; 19 C1
Exon count: 1

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (31082841..31086592)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (31157331..31161082)

[Figure: genomic context on chromosome 19 (NC_000085.6).]

Transcript information: This gene has 1 transcript

Gene: Cstf2t ENSMUSG00000053536

Description: cleavage stimulation factor, 3' pre-RNA subunit 2, tau [Source:MGI Symbol;Acc:MGI:1932622]
Gene Synonyms: 64kDa, tCstF-64
Location: Chromosome 19: 31,082,837-31,087,069 forward strand. GRCm38:CM001012.2
About this gene: This gene has 1 transcript (splice variant), 100 orthologues, 5 paralogues, is a member of 1 Ensembl protein family and is associated with 9 phenotypes.

Transcripts

Name: Cstf2t-201
Transcript ID: ENSMUST00000066039.7
bp: 4233
Protein: 632aa
Translation ID: ENSMUSP00000093831.3
Biotype: Protein coding
CCDS: CCDS37958
UniProt: Q8C7E9
Flags: TSL:NA, GENCODE basic, APPRIS P1
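
The coordinates and lengths quoted in this report can be cross-checked with simple arithmetic. The sketch below assumes 1-based inclusive coordinates and a hypothetical helper `span_length`; the numbers are the ones listed in the gene and transcript information.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


# Ensembl gene location on chromosome 19 (GRCm38).
ensembl_gene_bp = span_length(31_082_837, 31_087_069)

# NCBI locations on the current and previous assemblies.
ncbi_current_bp = span_length(31_082_841, 31_086_592)
ncbi_previous_bp = span_length(31_157_331, 31_161_082)

# Coding sequence implied by a 632 aa protein: one codon per residue
# plus the stop codon.
cds_nt = 632 * 3 + 3
```

For a single-exon gene the Ensembl gene span should equal the transcript length, and here both come to 4233 bp. The two NCBI ranges have the same length (3752 bp), as expected when only the assembly coordinates shift.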

[Figure: 24.23 kb region of chromosome 19, 31.075 Mb to 31.095 Mb. Forward strand: Cstf2t-201 (protein coding). Contig: AC110180.10. Reverse strand genes: Prkg1-201 and Prkg1-202 (protein coding). Regulatory Build legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding, merged Ensembl/Havana.]

Transcript: ENSMUST00000066039

[Figure: protein domain and variant view of Cstf2t-201 (4.23 kb, forward strand, protein coding; ENSMUSP00000093...). Annotation tracks:

MobiDB lite; Low complexity (Seg)
Superfamily: RNA-binding domain superfamily
SMART: RNA recognition motif domain
Pfam: RNA recognition motif domain; Transcription termination and cleavage factor, C-terminal domain; Cleavage stimulation factor subunit 2, hinge domain
PROSITE profiles: RNA recognition motif domain
PANTHER: PTHR45735:SF3; PTHR45735
Gene3D: Nucleotide-binding alpha-beta plait domain superfamily; CSTF2, C-terminal domain superfamily
CDD: cd12671
Sequence variants (dbSNP and all other sources)
Variant legend: inframe insertion; missense variant; synonymous variant
Scale bar: 0 to 632]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
