https://www.alphaknockout.com

Mouse Fchsd2 Knockout Project (CRISPR/Cas9)

Objective: To create a Fchsd2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fchsd2 gene (NCBI Reference Sequence: NM_199012; Ensembl: ENSMUSG00000030691) is located on mouse chromosome 7. 21 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 21 (Transcript: ENSMUST00000032931). Exon 5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 10.6% of the coding region and covers 6.33% of it. The size of the effective KO region is ~145 bp. The KO region does not contain any other known gene.
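
The coverage figure in the note can be sanity-checked with simple arithmetic. The sketch below rests on two assumptions that this report does not state explicitly: that exon 5 is about 145 bp long (the stated size of the effective KO region), and that the coding region is 764 codons long, taken from the 764 aa protein listed for ENSMUST00000032931 and excluding the stop codon. The function name is hypothetical.

```python
# Rough consistency check of the "6.33% of the coding region" figure.
# Assumptions (not stated in the report): exon 5 is ~145 bp, and the
# coding region is 764 codons long (stop codon excluded).
PROTEIN_LENGTH_AA = 764
KO_REGION_BP = 145


def coverage_percent(region_bp: int, protein_aa: int) -> float:
    """Return the share of the coding region covered, in percent."""
    cds_bp = protein_aa * 3
    return round(100.0 * region_bp / cds_bp, 2)


exon5_coverage = coverage_percent(KO_REGION_BP, PROTEIN_LENGTH_AA)

# If exactly 145 coding bases were removed, the remainder modulo 3 tells
# whether the downstream reading frame would shift (non-zero) or not.
frame_remainder = KO_REGION_BP % 3

print(exon5_coverage)   # 6.33
print(frame_remainder)  # 1
```

Under these assumptions the computed share agrees with the 6.33% quoted above.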

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 5 and 21 labeled and two gRNA regions marked. Legend: exon of mouse Fchsd2; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
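
A self-alignment of this kind can be sketched as an exact word match: every window of the sequence is compared with every other window, and off-diagonal matches indicate repeats. The code below is a minimal illustration, not the tool used for this report; the function names and the toy sequences are hypothetical, and real dot plot tools may use different scoring.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def self_matches(seq: str, window: int = 15, reverse: bool = False):
    """List (i, j) pairs whose windows match exactly.

    Forward mode reports i < j with seq[i:i+window] == seq[j:j+window],
    so the main diagonal is skipped. Reverse mode reports pairs where the
    window at i equals the reverse complement of the window at j.
    """
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    hits = []
    for j in range(len(seq) - window + 1):
        word = seq[j:j + window]
        if reverse:
            word = reverse_complement(word)
        for i in index.get(word, []):
            if reverse or i < j:
                hits.append((i, j))
    return sorted(hits)


# Toy input: a 10 bp unit repeated three times between short flanks.
unit = "ACGGTCAATG"
repeat_seq = "TTTT" + unit * 3 + "CCCC"
repeat_hits = self_matches(repeat_seq, window=10)

# Toy control without any repeated 10 bp word.
control_hits = self_matches("ACGTTGCAAGCTTAGGCTAACCGT", window=10)

print(len(repeat_hits) > 0, control_hits)
```

In the toy input the tandem repeat produces off-diagonal hits such as (4, 14), while the control sequence produces none, which is the pattern the note describes for the region upstream of exon 5.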

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of Exon 5 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content along the sequence.]

Summary: Full Length(2000bp) | A(30.25% 605) | C(18.15% 363) | T(32.75% 655) | G(18.85% 377)

Note: The 2000 bp section upstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
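
The GC analysis can be illustrated with a sliding-window calculation. The sketch below is an assumption about how such a profile might be computed, not the tool used for this report: the function names and the step size are hypothetical, and no threshold for a "high GC-content region" is implied. The base counts are the ones given in the summary line above.

```python
def gc_content_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, GC percent) for each window along the sequence."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc / window))
    return results


def gc_percent_from_counts(counts: dict) -> float:
    """Overall GC percent from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


# Base counts reported for the 2000 bp upstream section.
upstream_counts = {"A": 605, "C": 363, "T": 655, "G": 377}
upstream_gc = gc_percent_from_counts(upstream_counts)

# Toy sequence with a small window, for illustration only.
toy_windows = gc_content_windows("GGCCAATT", window=4, step=2)

print(upstream_gc)   # 37.0
print(toy_windows)   # [(0, 100.0), (2, 50.0), (4, 0.0)]
```

The overall value of 37.0% follows directly from the reported counts (363 C plus 377 G out of 2000 bases).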

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content along the sequence.]

Summary: Full Length(2000bp) | A(31.65% 633) | C(15.0% 300) | T(34.95% 699) | G(18.4% 368)

Note: The 2000 bp section downstream of Exon 5 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr7   +       101184389  101186388    2000
YourSeq    115     10   192   2000     82.7%  chr13  +       104597858  104598040     183
YourSeq    111    708  1189   2000     77.5%  chr11  -       115454850  115455240     391
YourSeq    101    708  1189   2000     73.9%  chr12  +         8408723    8409094     372
YourSeq    101    694   840   2000     87.0%  chr11  +        59164830   59164986     157
YourSeq     96    336   826   2000     83.4%  chr11  -        75681876   75682453     578
YourSeq     92    711   830   2000     88.4%  chr3   -        37750860   37750979     120
YourSeq     92    699   826   2000     86.0%  chr4   +        21412923   21413050     128
YourSeq     91    699   825   2000     85.9%  chr9   -        58478626   58478752     127
YourSeq     91    358   826   2000     74.2%  chr10  -        51569131   51569354     224
YourSeq     91    708   830   2000     84.2%  chr11  +         5860569    5860688     120
YourSeq     90    698   826   2000     84.2%  chr11  +       117424305  117424432     128
YourSeq     87    708   826   2000     86.6%  chr7   +        20629990   20630108     119
YourSeq     87    708   826   2000     88.5%  chr10  +        78751259   78751377     119
YourSeq     86    708   825   2000     86.5%  chr11  -        30382396   30382513     118
YourSeq     86    707   826   2000     85.9%  chr11  +        41137946   41138065     120
YourSeq     85    699   830   2000     82.6%  chr14  -        54455811   54567017  111207
YourSeq     85    708   826   2000     85.8%  chr14  +        12804314   12804432     119
YourSeq     83    708   830   2000     83.8%  chr8   -        29880651   29880773     123
YourSeq     83    708   826   2000     84.9%  chr16  -        18887017   18887135     119

Note: The 2000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr7   +       101186534  101188533    2000
YourSeq     90    147   307   2000     84.0%  chr10  -        99616664   99616816     153
YourSeq     87    160   307   2000     92.3%  chr8   +       106621128  106621276     149
YourSeq     86    171   309   2000     92.2%  chrX   -       120367751  120367889     139
YourSeq     82    246   522   2000     88.7%  chr2   +       112263884  112264321     438
YourSeq     79    497   902   2000     90.8%  chr11  -        95116251   95549972  433722
YourSeq     79    146   309   2000     75.5%  chr17  +        88242726   88242891     166
YourSeq     78    172   307   2000     93.4%  chr2   +       181177410  181177545     136
YourSeq     78    175   309   2000     91.6%  chr2   +        27033498   27033632     135
YourSeq     77    155   309   2000     83.6%  chr3   -       134756546  134756684     139
YourSeq     76    169   307   2000     93.3%  chr5   -       149195491  149195630     140
YourSeq     76    171   302   2000     89.8%  chr10  -       109274210  109274342     133
YourSeq     76    170   307   2000     91.4%  chr19  +        15643546   15643683     138
YourSeq     76    147   309   2000     69.8%  chr16  +        79356799   79356947     149
YourSeq     75    171   309   2000     77.0%  chr15  +        73161264   73161402     139
YourSeq     74    419   885   2000     70.4%  chr10  -       127235811  127235930     120
YourSeq     73    149   289   2000     76.5%  chr10  -        21458343   21458545     203
YourSeq     73    278   892   2000     72.8%  chr1   +       133649813  133650222     410
YourSeq     72    148   308   2000     80.0%  chr17  -        30473611   30473746     136
YourSeq     72    148   307   2000     70.9%  chr1   -        16000635   16000789     155

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
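
The BLAT rows can also be read programmatically. The sketch below parses a few rows copied from the two result tables and measures the gap between the top upstream and top downstream hits on chr7. The class and function names are hypothetical, and reading that gap as the KO region is an assumption of this sketch rather than a statement from the report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; any text before the query name is ignored."""
    fields = row.split()
    fields = fields[fields.index("YourSeq") + 1:]
    score, q_start, q_end, q_size = (int(x) for x in fields[:4])
    identity = float(fields[4].rstrip("%"))
    chrom, strand = fields[5], fields[6]
    t_start, t_end, span = (int(x) for x in fields[7:10])
    return BlatHit(score, q_start, q_end, q_size, identity,
                   chrom, strand, t_start, t_end, span)


# Rows copied from the upstream and downstream BLAT tables.
upstream_top = parse_blat_row(
    "YourSeq 2000 1 2000 2000 100.0% chr7 + 101184389 101186388 2000")
upstream_next = parse_blat_row(
    "YourSeq 115 10 192 2000 82.7% chr13 + 104597858 104598040 183")
downstream_top = parse_blat_row(
    "YourSeq 2000 1 2000 2000 100.0% chr7 + 101186534 101188533 2000")

# Bases lying strictly between the two 2000 bp flanking sections.
gap_bp = downstream_top.t_start - upstream_top.t_end - 1

print(gap_bp)  # 145
print(upstream_next.score, upstream_next.identity)
```

The 145 bp gap between the two flanking sections matches the ~145 bp effective KO region given in the strategy summary, and the best off-target upstream hit scores 115 against 2000 for the self hit.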

Gene information:

Fchsd2 FCH and double SH3 domains 2 [Mus musculus (house mouse)]
Gene ID: 207278, updated on 12-Aug-2019

Gene summary

Official Symbol: Fchsd2 (provided by MGI)
Official Full Name: FCH and double SH3 domains 2 (provided by MGI)
Primary source: MGI:MGI:2448475
See related: Ensembl:ENSMUSG00000030691
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: NWK1; R74866; Sh3md3; BC034086; mKIAA0769
Expression: Ubiquitous expression in CNS E18 (RPKM 15.4), whole brain E14.5 (RPKM 12.6) and 28 other tissues

Genomic context

Location: 7; 7 E2
Exon count: 21

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 7 | NC_000073.6 (101108499..101284405)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 7 | NC_000073.5 (108257289..108432919)

[Figure: genomic context on chromosome 7 (NC_000073.6).]

Transcript information: This gene has 11 transcripts

Gene: Fchsd2 ENSMUSG00000030691

Description: FCH and double SH3 domains 2 [Source:MGI Symbol;Acc:MGI:2448475]
Gene Synonyms: Sh3md3
Location: Chromosome 7: 101,092,863-101,284,405 forward strand. GRCm38:CM001000.2
About this gene: This gene has 11 transcripts (splice variants), 276 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Fchsd2-201 | ENSMUST00000032931.8 | 4453 | 764aa | ENSMUSP00000032931.7 | Protein coding | CCDS52327 | Q3USJ8 | TSL:1 GENCODE basic APPRIS P4
Fchsd2-202 | ENSMUST00000098250.9 | 4316 | 740aa | ENSMUSP00000095850.3 | Protein coding | CCDS52328 | Q3USJ8 | TSL:1 GENCODE basic APPRIS ALT1
Fchsd2-209 | ENSMUST00000208439.1 | 588 | 179aa | ENSMUSP00000146962.1 | Protein coding | - | A0A140LIU6 | CDS 3' incomplete TSL:5
Fchsd2-206 | ENSMUST00000145802.7 | 3133 | No protein | - | Retained intron | - | - | TSL:2
Fchsd2-210 | ENSMUST00000208638.1 | 3006 | No protein | - | Retained intron | - | - | TSL:NA
Fchsd2-211 | ENSMUST00000208917.1 | 2685 | No protein | - | Retained intron | - | - | TSL:NA
Fchsd2-203 | ENSMUST00000130426.1 | 2235 | No protein | - | Retained intron | - | - | TSL:2
Fchsd2-208 | ENSMUST00000208063.1 | 2188 | No protein | - | Retained intron | - | - | TSL:NA
Fchsd2-207 | ENSMUST00000151693.1 | 607 | No protein | - | Retained intron | - | - | TSL:3
Fchsd2-204 | ENSMUST00000137196.7 | 1155 | No protein | - | lncRNA | - | - | TSL:1
Fchsd2-205 | ENSMUST00000142727.1 | 528 | No protein | - | lncRNA | - | - | TSL:3

[Figure: Ensembl region view, 211.54 kb, chromosome 7 from about 101.10 Mb to 101.25 Mb.
Forward strand (comprehensive set): Fchsd2-201, Fchsd2-202 and Fchsd2-209 (protein coding); Fchsd2-204 and Fchsd2-205 (lncRNA); Fchsd2-203, Fchsd2-206, Fchsd2-207, Fchsd2-208, Fchsd2-210 and Fchsd2-211 (retained intron); Gm47324-201 (processed pseudogene).
Contigs: AC150313.7, AC107638.18.
Reverse strand (comprehensive set): Gm6341-201 (processed pseudogene); Atg16l2-201 and Atg16l2-202 (protein coding); Atg16l2-208, Atg16l2-211 and Atg16l2-216 (nonsense mediated decay); Atg16l2-204, Atg16l2-205, Atg16l2-206, Atg16l2-207, Atg16l2-212, Atg16l2-213 and Atg16l2-214 (retained intron); Atg16l2-209 (lncRNA).
Other tracks: Regulatory Build.
Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; RNA gene; processed transcript).]

Transcript: ENSMUST00000032931

[Figure: Ensembl protein view of Fchsd2-201 (protein coding), 175.67 kb, forward strand; protein scale bar from 0 to 764.
Tracks shown: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils).
Superfamily: AH/BAR domain superfamily; SH3-like domain superfamily.
SMART: FCH domain; SH3 domain.
Prints: SH3 domain.
Pfam: FCH domain; SH3 domain; SH3 domain.
PROSITE profiles: F-BAR domain; SH3 domain.
PANTHER: PTHR15735; F-BAR and double SH3 domains protein 2.
Gene3D: AH/BAR domain superfamily; 2.30.30.40.
CDD: cd07677; FCHSD2, SH3 domain 2; FCHSD, SH3 domain 1.
Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
