https://www.alphaknockout.com

Mouse Cyfip1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Cyfip1 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cyfip1 gene (NCBI Reference Sequence: NM_011370; Ensembl: ENSMUSG00000030447) is located on mouse chromosome 7. Thirty-one exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 31 (Transcript: ENSMUST00000032629). Exons 4 to 6 are selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Cyfip1 gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-7P16 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mutations at this locus result in embryonic lethality before the turning stage in homozygotes. Heterozygotes exhibit abnormal synaptic transmission. Parental origin of the mutant allele in heterozygotes has an effect on long-term depression, cued fear conditioning, anxiety, and activity.

Exon 4 starts at about 5.53% of the coding region. Knockout of exons 4 to 6 will result in a frameshift of the gene. Intron 3, used for 5'-loxP site insertion, is 1289 bp; intron 6, used for 3'-loxP site insertion, is 2602 bp. The effective cKO region is ~2143 bp. The cKO region does not contain any other known gene.
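The two numeric claims above (the position of exon 4 in the coding region, and the frameshift) can be illustrated with a little arithmetic. The sketch below is only an approximation under stated assumptions: the 1253 aa protein length is taken from transcript Cyfip1-201 in the transcript table, the coding length is assumed to include the stop codon, and the exon lengths passed to `causes_frameshift` are hypothetical placeholders, because this report does not list the coding lengths of exons 4 to 6.

```python
def cds_length_nt(protein_aa: int) -> int:
    """Coding sequence length in nucleotides, counting the stop codon."""
    return 3 * (protein_aa + 1)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds = cds_length_nt(1253)                # 1253 aa: Cyfip1-201 in the transcript table
exon4_start_nt = round(0.0553 * cds)     # approximate CDS position where exon 4 begins
exon4_start_codon = (exon4_start_nt - 1) // 3 + 1

# Hypothetical coding lengths for three deleted exons (NOT the real Cyfip1 values).
example_exon_lengths = [100, 120, 96]

print(cds, exon4_start_nt, exon4_start_codon)
print(causes_frameshift(sum(example_exon_lengths)))
```

Under these assumptions exon 4 begins roughly 208 nt into the coding sequence, near codon 70, so a frameshift there would disrupt almost the entire protein.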


Overview of the Targeting Strategy

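[Figure: schematic of the targeting strategy, showing the wildtype allele (exons 1 to 31, with the 5' and 3' gRNA regions flanking exons 4 to 6), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Cyfip1, homology arm, cKO region, loxP site.]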
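The step from the targeted allele to the constitutive KO allele can be modelled as a string operation: Cre recombines two loxP sites in the same orientation, removing the sequence between them and leaving one loxP site behind. The following is a toy sketch only. The 34 bp loxP sequence is the canonical one; the flanking and floxed sequences are hypothetical placeholders, not Cyfip1 sequence, and `cre_excise` is an illustrative helper name.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after recombination between the first two same-orientation sites."""
    first = allele.find(site)
    second = allele.find(site, first + len(site)) if first != -1 else -1
    if first == -1 or second == -1:
        return allele  # fewer than two sites: nothing to excise
    # Keep everything up to and including the first site, then resume after the second site.
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy sequences (hypothetical placeholders, not real genomic sequence).
upstream = "AAACCC"
floxed_exons = "GGGTTTGGG"
downstream = "TTTAAA"

targeted = upstream + LOXP + floxed_exons + LOXP + downstream
knockout = cre_excise(targeted)
print(len(targeted), len(knockout))
```

An allele without two loxP sites, such as the wildtype allele, is returned unchanged, which mirrors why only the targeted allele is sensitive to Cre.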


Overview of the Dot Plot

[Figure: dot plot of the sequence against itself (forward and reverse complement). Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
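A self dot plot of this kind can be approximated by recording every pair of positions whose 10 bp words are identical. The sketch below is a simplified, forward-strand-only illustration, not the tool used to produce the figure; the function names and the toy sequence are assumptions made for the example. Many hits sharing one small diagonal offset suggest a tandem repeat with that period.

```python
def self_dot_plot(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the window-length words at i and j match."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def repeat_offsets(hits):
    """Count hits per diagonal offset (j - i)."""
    counts = {}
    for i, j in hits:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Toy sequence (hypothetical): an 11 bp unit repeated three times between unique flanks.
toy = "ACGTTGCA" + "GATTACAGATC" * 3 + "TTGACCGT"
counts = repeat_offsets(self_dot_plot(toy, window=10))
print(max(counts, key=counts.get))
```

For the toy sequence the most populated offset equals the repeat unit length, which is the off-diagonal line one would look for in the dot plot matrix.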

Overview of the GC Content Distribution

[Figure: GC content distribution along the sequence. Window size: 300 bp.]

Summary: Full Length(8643bp) | A(24.98% 2159) | C(22.23% 1921) | T(30.26% 2615) | G(22.54% 1948)

Note: The sequence of the homologous arms and cKO region was analyzed to determine the GC content. No region of significantly high GC content was found, so this region is suitable for PCR screening or sequencing analysis.
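The overall GC content is not stated in the summary but follows directly from the base counts. The sketch below derives it and also shows one way a 300 bp sliding-window profile could be computed; `windowed_gc` is an illustrative helper, not the tool that produced the plot.

```python
counts = {"A": 2159, "C": 1921, "T": 2615, "G": 1948}  # base counts from the Summary line
total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """Yield (start, GC fraction) for each full window along seq."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, (chunk.count("G") + chunk.count("C")) / window


print(total, round(100 * gc_fraction, 2))
```

The counts sum to the reported full length of 8643 bp and give an overall GC content of about 44.76%.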


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       55870197   55873196   3000
YourSeq  197    1118   1450  3000   88.2%     chr8   +       119316009  119316368  360
YourSeq  164    1094   1450  3000   83.2%     chr1   -       192067073  192067448  376
YourSeq  156    1134   1450  3000   89.2%     chr5   -       147622340  147622681  342
YourSeq  156    1112   1449  3000   85.3%     chr14  +       19863423   19863791   369
YourSeq  153    1116   1457  3000   85.8%     chr13  -       92192886   92193233   348
YourSeq  152    1134   1446  3000   81.7%     chr10  +       61556665   61556958   294
YourSeq  151    1135   1456  3000   87.7%     chr7   +       36682983   36683334   352
YourSeq  150    1158   1447  3000   81.1%     chr15  +       36941353   36941664   312
YourSeq  146    1093   1447  3000   87.3%     chr6   +       37710160   37710572   413
YourSeq  144    1134   1450  3000   88.3%     chr8   -       28326024   28326358   335
YourSeq  139    1191   1450  3000   81.3%     chr7   -       56024551   56024814   264
YourSeq  138    1134   1450  3000   85.7%     chr9   -       22235364   22235700   337
YourSeq  137    1134   1450  3000   88.6%     chr6   -       5427163    5427500    338
YourSeq  136    1136   1447  3000   87.8%     chr16  -       55624853   55625178   326
YourSeq  136    1107   1451  3000   89.1%     chr17  +       64857073   64857447   375
YourSeq  135    1112   1446  3000   83.7%     chr18  +       74700872   74701207   336
YourSeq  135    1188   1450  3000   85.7%     chr16  +       34617024   34617293   270
YourSeq  133    1188   1450  3000   81.5%     chr9   +       21861375   21861644   270
YourSeq  131    1100   1443  3000   88.9%     chr2   -       69595775   69637156   41382

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
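One way to read a BLAT table like this is to set aside the full-length self-hit and inspect the best remaining hit. The sketch below uses only the first four rows of the upstream table; `BlatHit`, `secondary_hits` and `query_coverage` are hypothetical names introduced for illustration, and the report does not state what threshold it uses for "significant".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    identity: float  # percent identity
    chrom: str


# First four rows of the upstream BLAT table.
hits = [
    BlatHit(3000, 1, 3000, 100.0, "chr7"),
    BlatHit(197, 1118, 1450, 88.2, "chr8"),
    BlatHit(164, 1094, 1450, 83.2, "chr1"),
    BlatHit(156, 1134, 1450, 89.2, "chr5"),
]


def secondary_hits(hits, query_size=3000):
    """Drop the full-length 100% self-hit and return the rest, best score first."""
    rest = [h for h in hits if not (h.score == query_size and h.identity == 100.0)]
    return sorted(rest, key=lambda h: h.score, reverse=True)


def query_coverage(hit, query_size=3000):
    """Fraction of the query spanned by the hit."""
    return (hit.q_end - hit.q_start + 1) / query_size


best = secondary_hits(hits)[0]
print(best.chrom, best.score, round(query_coverage(best), 3))
```

The best secondary hit covers only about 11% of the 3000 bp query at 88.2% identity, which is consistent with the note that no significant similarity is found.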

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr7   +       55875340   55878339   3000
YourSeq  163    11     236   3000   90.6%     chr11  +       97851270   98046378   195109
YourSeq  157    23     236   3000   91.1%     chr11  -       59128639   59128857   219
YourSeq  150    11     236   3000   90.9%     chr7   -       49499280   49499527   248
YourSeq  148    62     245   3000   92.0%     chr5   -       134164180  134164379  200
YourSeq  146    18     228   3000   94.6%     chr16  -       32354857   32692253   337397
YourSeq  146    60     251   3000   90.7%     chr13  -       54853014   54853221   208
YourSeq  145    60     248   3000   91.5%     chr8   -       3559002    3559196    195
YourSeq  144    60     236   3000   89.4%     chr2   -       168656162  168656333  172
YourSeq  144    60     239   3000   89.2%     chr12  -       78830244   78830416   173
YourSeq  139    60     238   3000   90.7%     chr5   -       143496450  143496860  411
YourSeq  139    60     237   3000   88.8%     chr2   -       39088098   39088272   175
YourSeq  139    60     237   3000   86.4%     chr1   -       118262532  118262701  170
YourSeq  139    56     234   3000   89.5%     chr18  +       9713669    9713842    174
YourSeq  139    69     236   3000   92.7%     chr12  +       105736961  105737490  530
YourSeq  139    60     235   3000   88.7%     chr1   +       37637196   37637366   171
YourSeq  138    58     235   3000   90.6%     chr2   -       92959336   92959520   185
YourSeq  138    60     249   3000   89.3%     chr17  +       45499621   45499823   203
YourSeq  138    60     236   3000   87.0%     chr11  +       106758662  106758832  171
YourSeq  137    72     236   3000   90.8%     chr15  +       85881247   85881410   164

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
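The chr7 self-hits in the two BLAT tables give the genomic coordinates of the upstream and downstream query sequences, so the distance between them can be checked against the stated size of the effective cKO region. This is a small consistency check, assuming the coordinates are 1-based and inclusive (which the SPAN column of 3000 suggests).

```python
def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


up_start, up_end = 55870197, 55873196      # upstream self-hit on chr7
down_start, down_end = 55875340, 55878339  # downstream self-hit on chr7

# Sequence lying strictly between the two 3000 bp query regions.
gap = span(up_end + 1, down_start - 1)
print(span(up_start, up_end), span(down_start, down_end), gap)
```

The gap comes out at 2143 bp, matching the ~2143 bp effective cKO region given in the strategy summary.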


Gene and information: Cyfip1 cytoplasmic FMR1 interacting protein 1 [ Mus musculus (house mouse) ] Gene ID: 20430, updated on 22-Oct-2019

Gene summary

Official Symbol: Cyfip1 (provided by MGI)
Official Full Name: cytoplasmic FMR1 interacting protein 1 (provided by MGI)
Primary source: MGI:MGI:1338801
See related: Ensembl:ENSMUSG00000030447
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Shyc; Sra1; pl-1; Sra-1; l71Rl; l7Rl1; l(7)1Rl; P140sra1; P140SRA-1; mKIAA0068; E030028J09Rik
Expression: Ubiquitous expression in bladder adult (RPKM 10.5), limb E14.5 (RPKM 10.3) and 28 other tissues
Orthologs: human

Genomic context

Location: 7; 7 B5

Exon count: 33

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (55842022..55932633)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (63097441..63188003)

[Figure: genomic context of Cyfip1 on Chromosome 7 - NC_000073.6]


Transcript information: This gene has 12 transcripts

Gene: Cyfip1 ENSMUSG00000030447

Description: cytoplasmic FMR1 interacting protein 1 [Source:MGI Symbol;Acc:MGI:1338801]
Gene Synonyms: E030028J09Rik, P140SRA-1, Shyc, Sra-1, l(7)1Rl, l7Rl1, pl-1
Location: Chromosome 7: 55,841,745-55,932,602 forward strand. GRCm38:CM001000.2
About this gene: This gene has 12 transcripts (splice variants), 201 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Cyfip1-201 ENSMUST00000032629.15 6440 1253aa ENSMUSP00000032629.9 Protein coding CCDS21315 Q7TMB8 TSL:1 GENCODE basic APPRIS P1

Cyfip1-202 ENSMUST00000085255.10 4195 1251aa ENSMUSP00000082353.4 Protein coding CCDS52262 A0A0R4J119 TSL:1 GENCODE basic

Cyfip1-203 ENSMUST00000163845.3 4178 1253aa ENSMUSP00000127717.2 Protein coding CCDS21315 Q7TMB8 TSL:1 GENCODE basic APPRIS P1

Cyfip1-212 ENSMUST00000206862.1 2908 969aa ENSMUSP00000146194.1 Protein coding - A0A0U1RQ05 CDS 3' incomplete TSL:5

Cyfip1-208 ENSMUST00000173783.7 926 229aa ENSMUSP00000134509.1 Protein coding - G3UZI5 CDS 3' incomplete TSL:3

Cyfip1-205 ENSMUST00000173267.7 5111 No protein - Retained intron - - TSL:1

Cyfip1-207 ENSMUST00000173497.7 3849 No protein - Retained intron - - TSL:1

Cyfip1-204 ENSMUST00000168271.8 1876 No protein - Retained intron - - TSL:1

Cyfip1-211 ENSMUST00000205656.1 1588 No protein - Retained intron - - TSL:NA

Cyfip1-209 ENSMUST00000174660.7 719 No protein - Retained intron - - TSL:2

Cyfip1-206 ENSMUST00000173384.1 629 No protein - Retained intron - - TSL:3

Cyfip1-210 ENSMUST00000174793.1 466 No protein - Retained intron - - TSL:2


[Figure: Ensembl genome browser view of the Cyfip1 locus, 110.86 kb, chromosome 7 from 55.84 Mb to 55.94 Mb. Forward strand: the 12 Cyfip1 transcripts (Cyfip1-201, -202, -203, -208 and -212 protein coding; Cyfip1-204, -205, -206, -207, -209, -210 and -211 retained intron). Contigs: AC102298.14 and AC144633.3. Reverse strand: Gm17907-201 (processed pseudogene), Gm44616-201 (TEC) and the Nipa2 transcripts (Nipa2-201, -202, -203, -204, -205 and -210 protein coding; Nipa2-206, -207 and -211 retained intron; Nipa2-209 lncRNA). The Regulatory Build track is shown with its legend (CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site), together with the gene legend.]


Transcript: ENSMUST00000032629

[Figure: protein view of Cyfip1-201 (protein coding; 90.53 kb, forward strand), protein ENSMUSP00000032..., with a scale bar from 0 to 1253. Tracks: low complexity (Seg); domain annotations from Prints, Pfam, PIRSF and PANTHER (Cytoplasmic FMR1-interacting; Protein of unknown function DUF1394; PTHR12195:SF2); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
