https://www.alphaknockout.com

Mouse Clip2 Knockout Project (CRISPR/Cas9)

Objective: To create a Clip2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Clip2 gene (NCBI Reference Sequence: NM_009990; Ensembl: ENSMUSG00000063146) is located on mouse chromosome 5. Seventeen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 17 (Transcript: ENSMUST00000100647). Exons 3~7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous and heterozygous for disruptions in this gene display growth deficiency, brain abnormalities, hippocampal dysfunction and deficits in motor coordination.

Exon 3 starts at about 3.98% of the coding region. Exons 3~7 cover 38.14% of the coding region. The size of the effective KO region is ~9147 bp. The KO region does not contain any other known gene.
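The coding-region percentages can be converted into approximate base-pair counts. The sketch below is illustrative only: it assumes a coding-sequence length of 3144 bp, derived here from the 1047 aa protein listed for ENSMUST00000100647 plus one stop codon (the report does not state the CDS length), and the name `percent_to_bp` is hypothetical.

```python
# Assumption: CDS length = (1047 aa + 1 stop codon) * 3 bp.
# The report gives only percentages, not this length.
PROTEIN_LENGTH_AA = 1047
CDS_LENGTH_BP = (PROTEIN_LENGTH_AA + 1) * 3


def percent_to_bp(percent, total_bp=CDS_LENGTH_BP):
    """Convert a percentage of the coding region into a rounded bp count."""
    return round(percent / 100 * total_bp)


if __name__ == "__main__":
    print("Coding bases before exon 3:", percent_to_bp(3.98))
    print("Coding bases within exons 3~7:", percent_to_bp(38.14))
```

Because the percentages are rounded to two decimals, the resulting counts are estimates and should be checked against the annotated exon coordinates before use.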


Overview of the Targeting Strategy

[Figure: Wildtype allele, 5' to 3', showing exons 1, 3, 4, 5, 6, 7 and 17 with two gRNA regions marked. Legend: Exon of mouse Clip2; Knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot with forward and reverse-complement matches; axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
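A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration, not the tool used for the report (which is not named): it assumes exact matching of 15 bp windows, and the function names `revcomp` and `self_dot_plot` are hypothetical.

```python
from collections import defaultdict


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: i < j and the windows at i and j are identical.
    reverse: the window at j is the reverse complement of the window at i.
    Assumption: exact matches only; real tools may allow mismatches.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Repeated identical windows give off-diagonal forward dots.
        for a in positions:
            for b in positions:
                if a < b:
                    forward.append((a, b))
        # .get() avoids adding keys to the index while iterating over it.
        for a in positions:
            for b in index.get(revcomp(kmer), []):
                reverse.append((a, b))
    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    demo = "ACGGTCATTG" * 3  # toy tandem repeat, not from the report
    fwd, rev = self_dot_plot(demo, window=15)
    print(len(fwd), "forward matches;", len(rev), "reverse-complement matches")
```

Tandem repeats show up as runs of forward matches at a constant offset (the repeat period), which is the pattern a gRNA site would be placed to avoid.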

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot with forward and reverse-complement matches; axis label: Sequence 12.]

Note: The 2000 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(23.55% 471) | C(24.65% 493) | T(26.45% 529) | G(25.35% 507)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
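A sliding-window GC scan with a 300 bp window can be sketched as below. The report does not say what GC fraction counts as "significant", so the 0.70 threshold is an assumption chosen only for illustration, and the function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_profile(seq, window=300, step=1):
    """Return (start, gc_fraction) for every full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def high_gc_windows(seq, window=300, threshold=0.70, step=1):
    """Windows at or above the threshold. The threshold is an assumption."""
    return [(s, gc) for s, gc in gc_profile(seq, window, step) if gc >= threshold]


if __name__ == "__main__":
    demo = "AT" * 200 + "GC" * 200  # toy sequence, not from the report
    flagged = high_gc_windows(demo)
    print(len(flagged), "windows at or above the assumed threshold")
```

An empty result from `high_gc_windows` on a real flank would correspond to the "no significant high-GC-content region" conclusion under whatever threshold is chosen.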

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot; axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(25.55% 511) | C(23.25% 465) | T(24.55% 491) | G(26.65% 533)

Note: The 2000 bp section downstream of Exon 7 is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
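The overall GC content of each flank follows directly from the base counts in the two Summary lines. The short check below uses only those reported counts; the helper name `summarize` is hypothetical.

```python
# Base counts copied from the Summary lines of this report.
UPSTREAM = {"A": 471, "C": 493, "T": 529, "G": 507}
DOWNSTREAM = {"A": 511, "C": 465, "T": 491, "G": 533}


def summarize(counts):
    """Return (total bases, overall GC percentage rounded to 2 decimals)."""
    total = sum(counts.values())
    gc_percent = 100 * (counts["G"] + counts["C"]) / total
    return total, round(gc_percent, 2)


if __name__ == "__main__":
    for label, counts in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
        total, gc = summarize(counts)
        print(f"{label}: {total} bp, GC {gc}%")
```

Both flanks sum to 2000 bp and sit close to 50% GC overall, which is consistent with the notes, although a whole-region average cannot rule out a short local GC-rich stretch.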


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   -       134523143  134525142  2000
YourSeq  168    4      709   2000   94.3%     chr9   +       64962319   64963297   979
YourSeq  141    1      322   2000   94.4%     chr15  +       102043257  102043846  590
YourSeq  130    3      143   2000   97.2%     chr10  +       25657841   25657981   141
YourSeq  130    2      143   2000   96.5%     chr1   +       72216561   72216704   144
YourSeq  130    1      143   2000   95.8%     chr1   +       51751809   51751951   143
YourSeq  130    3      146   2000   95.2%     chr1   +       45377933   45378076   144
YourSeq  129    1      140   2000   96.5%     chr1   +       26400949   26401089   141
YourSeq  128    1      140   2000   94.3%     chr7   -       45481270   45481408   139
YourSeq  126    1      136   2000   96.4%     chr14  +       20055706   20055841   136
YourSeq  125    2      140   2000   94.8%     chr5   +       150913276  150913412  137
YourSeq  125    2      134   2000   97.0%     chr1   +       6208686    6208818    133
YourSeq  122    1      134   2000   94.0%     chr7   -       45056121   45056253   133
YourSeq  119    2      134   2000   93.2%     chr1   +       95575840   95575971   132
YourSeq  118    2      134   2000   92.4%     chr1   -       19799191   19799321   131
YourSeq  117    2      139   2000   93.2%     chr2   +       127833515  127833651  137
YourSeq  114    3      150   2000   89.2%     chr5   -       109703776  109703921  146
YourSeq  114    2      134   2000   90.9%     chr8   +       79725681   79725811   131
YourSeq  114    1      134   2000   91.0%     chr8   +       27797604   27797736   133
YourSeq  114    1      134   2000   91.0%     chr16  +       15700274   15700406   133

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
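One way to read a BLAT result table like this is to parse each row and compare how much of the 2000 bp query each hit covers. The sketch below is an illustration under stated assumptions: rows are whitespace-separated in the column order shown in this report, an optional leading "browser details" link text is ignored, and the names `BlatHit`, `parse_blat_row` and `secondary_hits` are hypothetical.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def query_coverage(self):
        """Fraction of the query covered by this hit (1-based, inclusive)."""
        return (self.q_end - self.q_start + 1) / self.q_size


def parse_blat_row(row):
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:  # link text from the web page
        fields = fields[2:]
    (query, score, q_start, q_end, q_size,
     identity, chrom, strand, t_start, t_end, span) = fields
    return BlatHit(query, int(score), int(q_start), int(q_end), int(q_size),
                   float(identity.rstrip("%")), chrom, strand,
                   int(t_start), int(t_end), int(span))


def secondary_hits(rows):
    """All hits except full-length, 100%-identity matches (the query itself)."""
    hits = [parse_blat_row(r) for r in rows]
    return [h for h in hits
            if not (h.identity == 100.0 and h.query_coverage == 1.0)]


if __name__ == "__main__":
    rows = [
        "YourSeq 2000 1 2000 2000 100.0% chr5 - 134523143 134525142 2000",
        "YourSeq 168 4 709 2000 94.3% chr9 + 64962319 64963297 979",
    ]
    for hit in secondary_hits(rows):
        print(hit.chrom, hit.score, f"{hit.query_coverage:.1%}")
```

Whether a secondary hit is "significant" is a judgment the report does not quantify; score and query coverage are simply two convenient quantities to inspect.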

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   -       134511996  134513995  2000
YourSeq  90     734    1029  2000   83.2%     chr1   +       121472829  121473071  243
YourSeq  78     967    1218  2000   83.2%     chr3   +       115717442  115717838  397
YourSeq  74     869    1028  2000   90.3%     chr2   -       179953958  179954125  168
YourSeq  72     898    1043  2000   90.2%     chr2   +       35469818   35469978   161
YourSeq  66     904    1000  2000   88.6%     chr6   -       145971670  145972044  375
YourSeq  66     869    1001  2000   90.3%     chr2   -       179988466  179988609  144
YourSeq  66     911    1008  2000   94.7%     chr11  +       21284476   21284585   110
YourSeq  65     911    1075  2000   94.6%     chr4   -       99391065   99391247   183
YourSeq  65     862    998   2000   83.6%     chr12  -       45055394   45055542   149
YourSeq  64     905    1028  2000   87.5%     chr8   +       15320733   15320855   123
YourSeq  62     879    1082  2000   79.9%     chr11  +       16970276   16970487   212
YourSeq  61     905    1034  2000   75.7%     chrX   -       106234862  106234992  131
YourSeq  60     967    1091  2000   77.6%     chr10  -       61678956   61679078   123
YourSeq  60     870    1149  2000   81.2%     chr11  +       70495638   70495915   278
YourSeq  60     922    1091  2000   88.5%     chr1   +       37043247   37043423   177
YourSeq  57     227    393   2000   91.2%     chr11  -       84074647   84074821   175
YourSeq  56     923    1018  2000   95.3%     chr2   +       144191588  144191701  114
YourSeq  53     905    1107  2000   83.1%     chr3   +       82887274   82887471   198
YourSeq  51     911    1192  2000   81.7%     chr13  -       31748896   31749169   274

Note: The 2000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
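The top-scoring hits in the two BLAT tables give the chr5 coordinates of the two 2000 bp flanking sections. Because Clip2 lies on the reverse strand, the section upstream of Exon 3 has the higher coordinates. The sketch below computes the number of bases lying between the two flanks; the helper names are hypothetical, and interpreting that gap as the KO region is an inference rather than something the report states.

```python
# Coordinates copied from the 100%-identity chr5 hits in this report.
UPSTREAM_FLANK = (134523143, 134525142)    # 2000 bp upstream of Exon 3
DOWNSTREAM_FLANK = (134511996, 134513995)  # 2000 bp downstream of Exon 7


def interval_length(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper):
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


if __name__ == "__main__":
    print("Upstream flank length:", interval_length(UPSTREAM_FLANK))
    print("Downstream flank length:", interval_length(DOWNSTREAM_FLANK))
    print("Bases between flanks:", gap_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK))
```

The gap works out to 9147 bases, the same figure the strategy summary gives for the effective KO region, which suggests the two flanks directly border it.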


Gene and information:

Clip2 CAP-GLY domain containing linker protein 2 [ Mus musculus (house mouse) ]
Gene ID: 269713, updated on 12-Aug-2019

Gene summary

Official Symbol: Clip2 (provided by MGI)
Official Full Name: CAP-GLY domain containing linker protein 2 (provided by MGI)
Primary source: MGI:MGI:1313136
See related: Ensembl:ENSMUSG00000063146
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Clip1; Cyln2; WSCR4; wbscr4; CLIP-115; mKIAA0291; B230327O20
Expression: Ubiquitous expression in whole brain E14.5 (RPKM 28.5), CNS E18 (RPKM 28.0) and 24 other tissues
Orthologs: human; all

Genomic context

Location: 5 G2; 5 74.63 cM
Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (134489383..134553767, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (134965256..135028304, complement)

[Figure: genomic context, Chromosome 5 - NC_000071.6.]
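The annotated locations can be turned into gene spans, and the flanking sections from the BLAT tables can be checked against the current assembly interval. This is a minimal sketch: it assumes the BLAT coordinates are on the same assembly as the GRCm38.p6 annotation (the report does not say which assembly BLAT used), and the helper names are hypothetical.

```python
# Intervals copied from the Genomic context table (1-based, inclusive).
GRCM38_P6 = (134489383, 134553767)  # NC_000071.6, complement
MGSCV37 = (134965256, 135028304)    # NC_000071.5, complement

# Flank coordinates copied from the BLAT tables.
# Assumption: these are on the same assembly as GRCM38_P6.
FLANKS = {
    "upstream of Exon 3": (134523143, 134525142),
    "downstream of Exon 7": (134511996, 134513995),
}


def span_bp(interval):
    """Length in bp of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


def contains(outer, inner):
    """True if the inner interval lies entirely within the outer interval."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    print("GRCm38.p6 span:", span_bp(GRCM38_P6), "bp")
    print("MGSCv37 span:", span_bp(MGSCV37), "bp")
    for name, flank in FLANKS.items():
        print(name, "inside annotated gene:", contains(GRCM38_P6, flank))
```

Strand does not affect these calculations; "complement" only indicates that the gene is read from the higher coordinate towards the lower one.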


Transcript information: This gene has 3 transcripts

Gene: Clip2 ENSMUSG00000063146

Description: CAP-GLY domain containing linker protein 2 [Source:MGI Symbol;Acc:MGI:1313136]
Gene Synonyms: CLIP-115, Cyln2, WSCR4
Location: Chromosome 5: 134,489,383-134,552,434 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 3 transcripts (splice variants), 198 orthologues, 4 paralogues, is a member of 1 Ensembl protein family and is associated with 11 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Clip2-202  ENSMUST00000100647.6  4994  1047aa      ENSMUSP00000098212.2  Protein coding   CCDS39310  Q9Z0H8   TSL:1 GENCODE basic APPRIS P4
Clip2-201  ENSMUST00000036999.9  4893  1012aa      ENSMUSP00000037431.6  Protein coding   CCDS39309  Q9Z0H8   TSL:1 GENCODE basic APPRIS ALT2
Clip2-203  ENSMUST00000202408.1  491   No protein  -                     Retained intron  -          -        TSL:3

[Figure: Ensembl genome browser view of an 83.05 kb region (134.48Mb to 134.56Mb). Forward strand: Gm42884-202 (lncRNA), Gm42884-201 (lncRNA). Contigs: AC167419.4, AC166938.5. Reverse strand: Gm42885-201 (TEC), Clip2-203 (retained intron), Syna-201 (protein coding), Clip2-201 (protein coding), Clip2-202 (protein coding), 2700029L08Rik-201 (TEC). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000100647

[Figure: Protein domain view of Clip2-202 (protein coding), reverse strand, 63.05 kb. Tracks for ENSMUSP00000098...: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily (CAP Gly-rich domain superfamily, SSF90257); SMART, Pfam, PROSITE profiles and PROSITE patterns (CAP Gly-rich domain); PANTHER (PTHR18916); CAP-Gly domain-containing linker protein 2; Gene3D (CAP Gly-rich domain superfamily); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 1047.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
