https://www.alphaknockout.com

Mouse Fchsd2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Fchsd2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fchsd2 gene (NCBI Reference Sequence: NM_199012; Ensembl: ENSMUSG00000030691) is located on mouse chromosome 7. Twenty-one exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 21 (Transcript: ENSMUST00000032931). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Fchsd2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-452J21 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 0.96% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 1699 bp, and the size of intron 2 for 3'-loxP site insertion is 28314 bp. The size of the effective cKO region is ~598 bp. The cKO region does not contain any other known gene.
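The figures in the note can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the coding length can be derived from the 764 aa protein listed for transcript Fchsd2-201 plus one stop codon, and the function names are hypothetical helpers, not part of the project's pipeline. The frameshift check states the general rule (a deleted coding length that is not a multiple of 3 shifts the reading frame); the actual coding length of exon 2 is not given in this report, so it is not filled in here.

```python
# Assumption: CDS length = 3 * (protein length + 1 stop codon), using the
# 764 aa protein reported for Fchsd2-201 (ENSMUST00000032931).
PROTEIN_LENGTH_AA = 764
EXON2_START_FRACTION = 0.0096  # "about 0.96% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding sequence length in nucleotides for a protein of protein_aa residues."""
    return 3 * (protein_aa + (1 if include_stop else 0))


def approx_cds_offset(fraction: float, cds_len: int) -> int:
    """Approximate number of coding nucleotides that precede a given fraction of the CDS."""
    return round(fraction * cds_len)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_len = cds_length_nt(PROTEIN_LENGTH_AA)
offset = approx_cds_offset(EXON2_START_FRACTION, cds_len)
print(f"CDS length: {cds_len} nt; exon 2 begins after roughly {offset} coding nt")
```

Under these assumptions exon 2 begins within the first few codons, so a frameshifting deletion there would disrupt nearly the whole open reading frame.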


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' and 3' gRNA regions flanking exon 2; exons 1, 2 and 21 labelled), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: homology arm, exon of mouse Fchsd2, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to detect tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
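The self-alignment described in the note can be sketched as an exact-match dot plot. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of fixed-size windows (10 bp by default, as in the plot) and an uppercase A/C/G/T sequence; `dot_plot_matches` and `reverse_complement` are hypothetical helper names.

```python
from collections import defaultdict


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: windows at i < j that are identical (off-diagonal dots).
    reverse: window at i equals the reverse complement of the window at j.
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j) for positions in index.values()
        for i in positions for j in positions if i < j
    )

    rc = reverse_complement(seq)
    n = len(seq)
    reverse = []
    for k in range(len(rc) - window + 1):
        # The window starting at k in the reverse complement corresponds to
        # the forward-strand window starting at n - window - k.
        for i in index.get(rc[k:k + window], []):
            reverse.append((i, n - window - k))
    return forward, sorted(reverse)


forward, reverse = dot_plot_matches("ACGTAGGCTA" * 2, window=10)
print(forward, reverse)
```

Runs of consecutive forward matches parallel to the main diagonal would indicate tandem or dispersed repeats within the analyzed sequence.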

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7098bp) | A(22.57% 1602) | C(24.99% 1774) | T(28.53% 2025) | G(23.91% 1697)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so constructing this targeting vector may be difficult.
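The base counts in the summary give the overall GC content directly, and a sliding window shows how local GC content could be profiled. The sketch below is illustrative: the counts are taken from the summary line, while `sliding_gc` and its `step` parameter are assumptions about how such a profile might be computed, not the report's actual method.

```python
# Base counts copied from the summary line (full length 7098 bp).
BASE_COUNTS = {"A": 1602, "C": 1774, "T": 2025, "G": 1697}


def gc_fraction_from_counts(counts: dict) -> float:
    """Overall GC fraction from a dict of base counts."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each window of the given size along seq."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(f"Total length: {sum(BASE_COUNTS.values())} bp")
print(f"Overall GC content: {gc_fraction_from_counts(BASE_COUNTS):.2%}")
```

The overall value sits close to 49%, so the difficulty flagged in the note comes from local high-GC windows rather than from the average composition.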


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr7   +       101107490  101110489  3000
YourSeq     27    959   1007   3000     86.3%  chr4   +       135673085  135673131    47
YourSeq     25   1206   1233   3000     96.3%  chr8   +        16429783   16429812    30
YourSeq     24   2569   2592   3000    100.0%  chr2   -       110198693  110198716    24
YourSeq     23   1383   1410   3000     84.0%  chr5   +       151695116  151695141    26
YourSeq     23   2646   2668   3000    100.0%  chr11  +        75510940   75510962    23
YourSeq     22   1022   1043   3000    100.0%  chr14  -        74746050   74746071    22
YourSeq     22   2484   2505   3000    100.0%  chrX   +       105597526  105597547    22

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr7   +       101111088  101114087  3000
YourSeq    671    649   1547   3000     96.8%  chr5   -        51141595   51142484   890
YourSeq    660    800   1547   3000     96.8%  chr2   -        66359281   66360167   887
YourSeq    659    801   1542   3000     96.0%  chr10  +         4501316    4502060   745
YourSeq    623    827   1545   3000     96.5%  chr5   -        18740518   18741228   711
YourSeq    613    807   1547   3000     96.0%  chr17  +        32900360   32901086   727
YourSeq    601    901   1547   3000     98.0%  chr2   +       146572575  146573379   805
YourSeq    591    800   1566   3000     95.7%  chr15  -        24894367   24895060   694
YourSeq    589    557   1546   3000     95.5%  chrX   -        46461167   46461994   828
YourSeq    584    570   1547   3000     95.1%  chr1   +        62864893   62865539   647
YourSeq    583    804   1547   3000     95.8%  chr8   -        24757356   24757984   629
YourSeq    580    803   1547   3000     96.3%  chr5   +       103329239  103329847   609
YourSeq    578    964   1564   3000     98.2%  chr17  -        40913070   40913664   595
YourSeq    574    801   1547   3000     97.7%  chrX   +        94571045   94571874   830
YourSeq    574    807   1547   3000     96.0%  chr3   +        67888060   67888657   598
YourSeq    573    963   1547   3000     98.7%  chr3   +        14739664   14740247   584
YourSeq    572    800   1547   3000     95.5%  chr9   -        12847301   12847915   615
YourSeq    570    800   1547   3000     95.5%  chr5   -        85565841   85566495   655
YourSeq    570    969   1547   3000     99.4%  chr7   +        75486677   75487257   581
YourSeq    570    800   1547   3000     95.5%  chr1   +       119880905  119881510   606

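Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. Besides the expected chr7 self-match, the search returns multiple partial hits (scores 570-671, identity 95.1-99.4%) that all fall within query positions 557-1566, which is consistent with a repetitive element in that part of the section. No other full-length match is found.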
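BLAT rows like those in the two tables can be parsed and screened programmatically. The sketch below is a minimal illustration using four rows copied from the tables; the `BlatHit` class, the helper names and the `min_score` cutoff of 500 are all hypothetical choices for this example, not thresholds used in the report. It also shows that the gap between the two chr7 self-matches agrees with the ~598 bp cKO region size stated in the strategy note.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row (query name in the first field)."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_hit(hit: BlatHit) -> bool:
    """The query aligned to its own locus: full-length, 100% identity."""
    return hit.score == hit.q_size and hit.identity == 100.0


def off_target_hits(hits, min_score: int):
    """Non-self hits at or above a (hypothetical) score cutoff."""
    return [h for h in hits if not is_self_hit(h) and h.score >= min_score]


def gap_between(upstream: BlatHit, downstream: BlatHit) -> int:
    """Bases lying strictly between two flanking self-hits on the same chromosome."""
    return downstream.t_start - upstream.t_end - 1


up_self = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr7 + 101107490 101110489 3000")
down_self = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr7 + 101111088 101114087 3000")
weak = parse_hit("YourSeq 27 959 1007 3000 86.3% chr4 + 135673085 135673131 47")
strong = parse_hit("YourSeq 671 649 1547 3000 96.8% chr5 - 51141595 51142484 890")

print("Region between flanks:", gap_between(up_self, down_self), "bp")
print("Hits at or above cutoff:", off_target_hits([up_self, weak, strong], min_score=500))
```

Whether a given hit matters for primer or arm design depends on where it falls relative to the planned PCR amplicons, which this sketch does not model.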


Gene and information:

Fchsd2 FCH and double SH3 domains 2 [Mus musculus (house mouse)]

Gene ID: 207278, updated on 12-Aug-2019

Gene summary

Official Symbol: Fchsd2 (provided by MGI)
Official Full Name: FCH and double SH3 domains 2 (provided by MGI)
Primary source: MGI:MGI:2448475
See related: Ensembl:ENSMUSG00000030691
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: NWK1; R74866; Sh3md3; BC034086; mKIAA0769
Expression: Ubiquitous expression in CNS E18 (RPKM 15.4), whole brain E14.5 (RPKM 12.6) and 28 other tissues

Genomic context

Location: 7; 7 E2

Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (101108499..101284405)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (108257289..108432919)

[Figure: genomic context, Chromosome 7 - NC_000073.6]


Transcript information: This gene has 11 transcripts

Gene: Fchsd2 ENSMUSG00000030691

Description: FCH and double SH3 domains 2 [Source:MGI Symbol;Acc:MGI:2448475]
Gene Synonyms: Sh3md3
Location: Chromosome 7: 101,092,863-101,284,405 forward strand. GRCm38:CM001000.2
About this gene: This gene has 11 transcripts (splice variants), 276 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Fchsd2-201  ENSMUST00000032931.8  4453  764aa       ENSMUSP00000032931.7  Protein coding   CCDS52327  Q3USJ8      TSL:1 GENCODE basic APPRIS P4
Fchsd2-202  ENSMUST00000098250.9  4316  740aa       ENSMUSP00000095850.3  Protein coding   CCDS52328  Q3USJ8      TSL:1 GENCODE basic APPRIS ALT1
Fchsd2-209  ENSMUST00000208439.1  588   179aa       ENSMUSP00000146962.1  Protein coding   -          A0A140LIU6  CDS 3' incomplete TSL:5
Fchsd2-206  ENSMUST00000145802.7  3133  No protein  -                     Retained intron  -          -           TSL:2
Fchsd2-210  ENSMUST00000208638.1  3006  No protein  -                     Retained intron  -          -           TSL:NA
Fchsd2-211  ENSMUST00000208917.1  2685  No protein  -                     Retained intron  -          -           TSL:NA
Fchsd2-203  ENSMUST00000130426.1  2235  No protein  -                     Retained intron  -          -           TSL:2
Fchsd2-208  ENSMUST00000208063.1  2188  No protein  -                     Retained intron  -          -           TSL:NA
Fchsd2-207  ENSMUST00000151693.1  607   No protein  -                     Retained intron  -          -           TSL:3
Fchsd2-204  ENSMUST00000137196.7  1155  No protein  -                     lncRNA           -          -           TSL:1
Fchsd2-205  ENSMUST00000142727.1  528   No protein  -                     lncRNA           -          -           TSL:3


[Figure: Ensembl genomic view of a 211.54 kb region (axis ticks at 101.10, 101.15, 101.20 and 101.25 Mb). Forward strand: Fchsd2 transcripts Fchsd2-201 to Fchsd2-211 (protein coding, lncRNA and retained intron) and the processed pseudogene Gm47324-201. Contigs: AC150313.7, AC107638.18. Reverse strand: the processed pseudogene Gm6341-201 and Atg16l2 transcripts (protein coding, nonsense mediated decay, retained intron, lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (pseudogene, RNA gene, processed transcript).]


Transcript: ENSMUST00000032931

[Figure: Ensembl protein domain view of Fchsd2-201 (protein coding; 175.67 kb, forward strand; protein ENSMUSP00000032...; scale bar 0 to 764 aa). Annotation tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: AH/BAR domain superfamily, SH3-like domain superfamily; SMART: FCH domain, SH3 domain; Prints: SH3 domain; Pfam: FCH domain, SH3 domain, SH3 domain; PROSITE profiles: F-BAR domain, SH3 domain; PANTHER: PTHR15735, F-BAR and double SH3 domains protein 2; Gene3D: AH/BAR domain superfamily, 2.30.30.40; CDD: cd07677, FCHSD2 SH3 domain 2, FCHSD SH3 domain 1. Sequence variants (dbSNP and all other sources): missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
