https://www.alphaknockout.com

Mouse Cytip Knockout Project (CRISPR/Cas9)

Objective: To create a Cytip knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Cytip gene (NCBI Reference Sequence: NM_139200; Ensembl: ENSMUSG00000026832) is located on mouse chromosome 2. Eight exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000028175). Exons 2 to 5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null allele display impaired trafficking and/or cell adhesion of immune system cells. Mice homozygous for a reporter allele show normal immune cell development and function; however, mutant hematopoietic stem cells have impaired repopulating activity.

Exon 2 starts at about 16.25% of the coding region. Exons 2 to 5 cover 28.04% of the coding region. The effective KO region is ~3889 bp. The KO region does not contain any other known gene.
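The percentages above can be cross-checked against the 359 aa protein listed for ENSMUST00000028175 in the transcript table. The following is a minimal sketch, assuming the coding region is counted as 359 codons x 3 = 1077 bp with the stop codon excluded; that convention is an assumption and is not stated in this report.

```python
# Assumption: coding length excludes the stop codon (359 aa * 3 bp per codon).
PROTEIN_LENGTH_AA = 359
CODING_BP = PROTEIN_LENGTH_AA * 3


def percent_to_bp(percent: float, total_bp: int = CODING_BP) -> int:
    """Convert a percentage of the coding region to the nearest whole bp."""
    return round(percent / 100 * total_bp)


bp_before_exon2 = percent_to_bp(16.25)     # coding bp upstream of exon 2
bp_in_exons_2_to_5 = percent_to_bp(28.04)  # coding bp inside the target exons

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = bp_in_exons_2_to_5 % 3 != 0
print(bp_before_exon2, bp_in_exons_2_to_5, causes_frameshift)
```

Under this assumption, exons 2 to 5 would carry about 302 coding bp, which is not a multiple of 3, so their deletion would be expected to shift the reading frame of the downstream exons.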


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons labelled 1, 2, 3, 4, 5 and 8 and the two gRNA regions marked. Legend: exon of mouse Cytip; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of exon 2 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
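The self-alignment described in the note can be illustrated with a small sketch. It assumes exact matching of 15 bp words on both the forward and reverse-complement strands; the tool and matching rule actually used for the report's dot plots are not stated, and the toy sequence below is hypothetical, not Cytip sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) dots for exact window matches."""
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every word of length `window` by its start positions.
    positions = {}
    for i in range(n_windows):
        positions.setdefault(seq[i:i + window], []).append(i)
    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        # Forward dots: the same word elsewhere (main diagonal is skipped).
        forward.extend((i, j) for j in positions[word] if j != i)
        # Reverse dots: the reverse complement of the word starts at j.
        reverse.extend((i, j) for j in positions.get(reverse_complement(word), []))
    return forward, reverse


# Toy input (hypothetical): a 24 bp unit repeated in tandem, then a short tail.
unit = "ACGTTGCAAGGCTTAACCGGATAT"
fwd, rev = self_dot_plot(unit * 2 + "GATTACA", window=15)
print(len(fwd) > 0)
```

Off-diagonal forward dots that line up parallel to the main diagonal are what a tandem repeat looks like in such a plot; a sequence without repeats yields no forward dots at this window size.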

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream sequence. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of exon 5 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full length (2000 bp) | A (28.9%, 578) | C (19.5%, 390) | T (32.6%, 652) | G (19.0%, 380)

Note: The 2000 bp section upstream of exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
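As a rough illustration of the analysis in the note, the sketch below computes GC content in sliding windows and recomputes the overall GC content of the upstream section from the base counts in the Summary line. The 300 bp window comes from the report; the step size of 1 bp and the function names are assumptions made for illustration.

```python
def gc_percent(seq: str) -> float:
    """GC content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """GC percentage for each full window along the sequence."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts copied from the Summary line for the upstream section.
upstream_counts = {"A": 578, "C": 390, "T": 652, "G": 380}
total = sum(upstream_counts.values())
overall_gc = 100.0 * (upstream_counts["G"] + upstream_counts["C"]) / total
print(total, overall_gc)
```

The counts sum to the stated 2000 bp, and G plus C make up 770 of them, so the section as a whole is AT-rich rather than GC-rich.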

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full length (2000 bp) | A (31.65%, 633) | C (17.35%, 347) | T (29.9%, 598) | G (21.1%, 422)

Note: The 2000 bp section downstream of exon 5 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   -       58151764   58153763   2000
YourSeq  51     1422    1532  2000   94.8%     chr1   -       91018657   91018802   146
YourSeq  48     1503    1883  2000   96.3%     chr6   +       115393099  115393514  416
YourSeq  39     1444    1530  2000   97.6%     chr5   -       121179815  121179935  121
YourSeq  39     1409    1530  2000   95.4%     chr4   -       109481272  109481411  140
YourSeq  38     1437    1528  2000   86.6%     chr5   -       65686376   65686499   124
YourSeq  36     1442    1528  2000   97.4%     chr7   +       141281408  141281526  119
YourSeq  36     1445    1529  2000   97.5%     chr12  +       87023070   87023156   87
YourSeq  35     1503    1550  2000   79.0%     chr9   -       21892455   21892496   42
YourSeq  32     1503    1537  2000   87.9%     chr7   -       19268280   19268312   33
YourSeq  31     1503    1535  2000   97.0%     chr11  -       107543959  107543991  33
YourSeq  31     1498    1528  2000   100.0%    chr12  +       80595712   80595742   31
YourSeq  30     1504    1535  2000   96.9%     chr8   -       15782970   15783001   32
YourSeq  30     1503    1534  2000   96.9%     chr2   +       9386776    9386807    32
YourSeq  29     1511    1583  2000   94.0%     chr6   -       99352050   99352124   75
YourSeq  29     1498    1530  2000   94.0%     chr7   +       112960484  112960516  33
YourSeq  28     1503    1530  2000   100.0%    chr8   -       23965270   23965297   28
YourSeq  28     1503    1530  2000   100.0%    chr2   -       158426830  158426857  28
YourSeq  28     814     846   2000   93.8%     chr2   -       147962699  147962732  34
YourSeq  28     1503    1530  2000   100.0%    chr19  -       31605924   31605951   28

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
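One way to make the "no significant similarity" reading reproducible is to parse the BLAT rows and apply an explicit cutoff. The sketch below is illustrative only: it parses three rows from the upstream results (with the "browser details" link text removed) and uses a hypothetical cutoff of 10% of the query size, which is an assumption and not the criterion used in this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row that starts with the query name."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


rows = """\
YourSeq 2000 1 2000 2000 100.0% chr2 - 58151764 58153763 2000
YourSeq 51 1422 1532 2000 94.8% chr1 - 91018657 91018802 146
YourSeq 48 1503 1883 2000 96.3% chr6 + 115393099 115393514 416
"""
hits = [parse_hit(row) for row in rows.splitlines()]

# Hypothetical cutoff: keep hits scoring at least 10% of the query size.
MIN_SCORE_FRACTION = 0.10
significant = [h for h in hits if h.score >= MIN_SCORE_FRACTION * h.q_size]
print([(h.chrom, h.score) for h in significant])
```

With this cutoff only the full-length match of the query to its own locus on chr2 remains, which is the expected self-hit rather than an off-target similarity.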

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART    TEND      SPAN
-------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr2   -       58145875  58147874  2000
YourSeq  24     874     901   2000   96.2%     chr1   +       9441947   9441979   33
YourSeq  23     1481    1503  2000   100.0%    chr11  +       5400871   5400893   23
YourSeq  22     712     733   2000   100.0%    chr9   +       53540555  53540576  22
YourSeq  22     596     621   2000   92.4%     chr7   +       72126576  72126601  26

Note: The 2000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
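The two full-length chr2 hits in the BLAT results locate the flanking sections on the genome, so the interval between them can be compared with the stated KO region size. This is a small sketch, assuming the displayed coordinates are 1-based and inclusive (consistent with each 2000 bp hit spanning end minus start plus 1); because Cytip lies on the reverse strand, the upstream flank has the higher coordinates.

```python
# Genomic coordinates (chr2) of the self-hits reported in the BLAT tables.
UPSTREAM_FLANK = (58151764, 58153763)    # 2000 bp upstream of exon 2
DOWNSTREAM_FLANK = (58145875, 58147874)  # 2000 bp downstream of exon 5


def interval_between(low_flank, high_flank):
    """Return (start, end, length) of the interval strictly between two flanks."""
    start = low_flank[1] + 1
    end = high_flank[0] - 1
    return start, end, end - start + 1


ko_start, ko_end, ko_length = interval_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)
print(ko_start, ko_end, ko_length)
```

Under these assumptions the interval between the flanks is 3889 bp, which agrees with the effective KO region size given in the strategy summary.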


Gene information: Cytip, cytohesin 1 interacting protein [Mus musculus (house mouse)]
Gene ID: 227929, updated on 12-Aug-2019

Gene summary

Official Symbol: Cytip (provided by MGI)
Official Full Name: cytohesin 1 interacting protein (provided by MGI)
Primary source: MGI:MGI:2183535
See related: Ensembl:ENSMUSG00000026832
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Cbp; Cybr; C80816; Pscdbp; AI462064; A130053M09Rik
Expression: Biased expression in thymus adult (RPKM 15.2), spleen adult (RPKM 14.5) and 6 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 C1.1
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (58129139..58195549, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (57981550..58012533, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 8 transcripts

Gene: Cytip ENSMUSG00000026832

Description: cytohesin 1 interacting protein [Source:MGI Symbol;Acc:MGI:2183535]
Gene Synonyms: A130053M09Rik, Cbp, Cybr, Pscdbp
Location: 58,129,137-58,195,532 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 8 transcripts (splice variants), 194 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Cytip-201  ENSMUST00000028175.6  6110  359aa       ENSMUSP00000028175.6  Protein coding  CCDS16048  Q91VY6   TSL:1, GENCODE basic, APPRIS P1
Cytip-205  ENSMUST00000148764.7  941   No protein  -                     lncRNA          -          -        TSL:3
Cytip-203  ENSMUST00000144117.7  856   No protein  -                     lncRNA          -          -        TSL:2
Cytip-204  ENSMUST00000146545.7  856   No protein  -                     lncRNA          -          -        TSL:1
Cytip-207  ENSMUST00000151785.1  686   No protein  -                     lncRNA          -          -        TSL:1
Cytip-202  ENSMUST00000131443.7  641   No protein  -                     lncRNA          -          -        TSL:5
Cytip-206  ENSMUST00000151169.1  387   No protein  -                     lncRNA          -          -        TSL:2
Cytip-208  ENSMUST00000153052.1  378   No protein  -                     lncRNA          -          -        TSL:3


[Figure: Ensembl region view, 86.40 kb, 58.12 Mb to 58.20 Mb. Forward strand: Gm13546-201 and Gm13546-202 (lncRNA); contig AL928564.8. Reverse strand: Cytip-201 (protein coding) and Cytip-202 to Cytip-208 (lncRNA). Regulatory Build legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000028175

[Figure: Cytip-201 (protein coding), reverse strand, 31.36 kb, with protein feature tracks for ENSMUSP00000028... Low complexity (Seg); Superfamily: PDZ superfamily; SMART: PDZ domain; Pfam: PDZ domain; PROSITE profiles: PDZ domain; PANTHER: PTHR15963, PTHR15963:SF1; Gene3D: 2.30.42.10; CDD: cd00992. Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant. Scale bar: 0 to 359.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
