https://www.alphaknockout.com

Mouse Troap Knockout Project (CRISPR/Cas9)

Objective: To create a Troap knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Troap gene (NCBI Reference Sequence: NM_001162506; Ensembl: ENSMUSG00000032783) is located on mouse chromosome 15. Fourteen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000230054). Exons 4~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 4 starts from about 16.72% of the coding region. Exons 4~10 cover 35.73% of the coding region. The size of the effective KO region: ~3963 bp. The KO region does not have any other known gene.
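The percentages in the note can be turned into approximate coding-base counts. The sketch below is only an estimate: it assumes the coding length is the 668 aa protein length listed for this transcript multiplied by 3 (stop codon excluded), and the function names are illustrative, not part of the project pipeline. Whether the deletion shifts the reading frame depends on the exact exon boundaries, which this estimate does not resolve.

```python
# Rough conversion of "percent of coding region" into coding bases.
# Assumption: CDS length = protein length (668 aa) x 3, stop codon excluded.

PROTEIN_LENGTH_AA = 668  # protein length listed for ENSMUST00000230054


def coding_length_nt(protein_aa, include_stop=False):
    """Coding length in nucleotides for a protein of the given length."""
    return protein_aa * 3 + (3 if include_stop else 0)


def percent_to_nt(percent, cds_nt):
    """Approximate number of coding bases that a percentage corresponds to."""
    return round(percent / 100 * cds_nt)


cds_nt = coding_length_nt(PROTEIN_LENGTH_AA)
exon4_offset_nt = percent_to_nt(16.72, cds_nt)  # coding bases before exon 4
ko_coding_nt = percent_to_nt(35.73, cds_nt)     # coding bases in exons 4~10

print(cds_nt, exon4_offset_nt, ko_coding_nt)
```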


Overview of the Targeting Strategy

Figure: schematic of the wildtype allele (5' to 3') showing exons 1 to 14 of mouse Troap, with gRNA regions flanking exons 4 to 10. Legends: exon of mouse Troap; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.

Note: The 1490 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
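A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration using exact 15 bp word matches, with the window size taken from the report; the tool actually used to draw the plots is not stated, and the function names here are hypothetical. Off-diagonal forward matches indicate direct repeats, and matches against the reverse complement indicate inverted repeats.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def dot_plot(seq_x, seq_y, window=15):
    """Return the set of (i, j) where seq_x[i:i+window] == seq_y[j:j+window]."""
    # Index every window of seq_y by its sequence for fast lookup.
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    hits = set()
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            hits.add((i, j))
    return hits


def self_dot_plot(seq, window=15):
    """Forward hits off the main diagonal, and hits against the reverse complement."""
    forward = {(i, j) for i, j in dot_plot(seq, seq, window) if i != j}
    reverse = dot_plot(seq, reverse_complement(seq), window)
    return forward, reverse
```

An empty or nearly empty `forward` set for a flanking sequence corresponds to the "no significant tandem repeat" conclusion drawn from the plot.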

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: dot plot of the sequence against itself. Legend: Forward; Reverse Complement.

Note: The 584 bp section downstream of Exon 10 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution along the sequence.

Summary: Full Length(1490bp) | A(23.69% 353) | C(24.03% 358) | T(29.66% 442) | G(22.62% 337)

Note: The 1490 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution along the sequence.

Summary: Full Length(584bp) | A(21.75% 127) | C(25.51% 149) | T(29.62% 173) | G(23.12% 135)

Note: The 584 bp section downstream of Exon 10 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
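The base counts in the two summaries are enough to recompute the per-base percentages and the overall GC content of each flanking section. The sketch below does that and also shows one plausible way to compute a windowed GC profile; the report states only the 300 bp window size, so the step size and the function names are assumptions.

```python
# Base counts copied from the two "Summary" lines of the report.
UPSTREAM = {"A": 353, "C": 358, "T": 442, "G": 337}    # 1490 bp
DOWNSTREAM = {"A": 127, "C": 149, "T": 173, "G": 135}  # 584 bp


def base_composition(counts):
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content in percent."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC percent of each window; the step size is an assumption."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return values


print(round(gc_percent(UPSTREAM), 2), round(gc_percent(DOWNSTREAM), 2))
```

Under these counts the overall GC content works out to about 46.6% upstream and 48.6% downstream, consistent with the notes that neither section contains a high-GC region.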


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1490   1      1490  1490   100.0%    chr15  +       99075799   99077288   1490
YourSeq  300    417    765   1490   96.4%     chr9   +       73062414   73062969   556
YourSeq  285    415    766   1490   93.9%     chr13  -       74516324   74516736   413
YourSeq  278    421    755   1490   94.6%     chr4   +       37500962   37501456   495
YourSeq  264    421    758   1490   93.5%     chr4   -       149490440  149491039  600
YourSeq  262    415    729   1490   95.2%     chr11  +       115735335  115735972  638
YourSeq  257    415    717   1490   94.8%     chr1   -       136702002  136702405  404
YourSeq  235    415    709   1490   91.9%     chr5   -       139470229  139470500  272
YourSeq  219    418    724   1490   95.9%     chr5   -       23730677   23731172   496
YourSeq  205    417    689   1490   88.9%     chr12  +       102444689  102444939  251
YourSeq  201    169    594   1490   91.9%     chr4   -       155079076  155079490  415
YourSeq  199    418    739   1490   93.5%     chr1   -       135270857  135271183  327
YourSeq  198    418    906   1490   92.7%     chr11  +       62673852   62674424   573
YourSeq  195    415    957   1490   88.8%     chr16  -       94229979   94230407   429
YourSeq  195    182    613   1490   97.2%     chrX   +       48595519   48596139   621
YourSeq  192    418    956   1490   86.5%     chr4   +       149190446  149190854  409
YourSeq  191    158    594   1490   94.5%     chr8   +       111986386  111986865  480
YourSeq  190    421    951   1490   87.3%     chr19  +       18724733   18725108   376
YourSeq  188    418    756   1490   94.8%     chr11  -       97111417   97112052   636
YourSeq  188    415    691   1490   96.6%     chr7   +       30128911   30129560   650

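Note: The 1490 bp section upstream of Exon 4 is BLAT searched against the genome. Apart from the expected full-length match at the target locus on chr15, only partial matches (score 300 or lower) are returned. No significant similarity is found.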

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  584    1      584   584    100.0%    chr15  +       99081252   99081835   584
YourSeq  151    181    343   584    97.6%     chr17  -       24683614   25031257   347644
YourSeq  140    181    343   584    91.6%     chr10  -       110524763  110524918  156
YourSeq  136    186    340   584    91.4%     chr3   +       95788783   95788933   151
YourSeq  135    195    343   584    96.6%     chr11  +       12436689   12437016   328
YourSeq  133    187    350   584    91.6%     chr11  -       74196933   74197094   162
YourSeq  131    178    343   584    93.5%     chr2   +       28504011   28634549   130539
YourSeq  130    144    341   584    90.6%     chr14  -       54655119   54655467   349
YourSeq  130    190    340   584    90.6%     chr13  +       41525034   41525171   138
YourSeq  130    190    343   584    90.0%     chr11  +       106066871  106067020  150
YourSeq  129    181    318   584    97.1%     chr2   -       163092903  163093054  152
YourSeq  128    193    339   584    91.5%     chr5   -       150837706  150837847  142
YourSeq  128    204    344   584    93.4%     chr5   -       123893835  123893971  137
YourSeq  128    195    346   584    94.5%     chr18  -       58853070   58853300   231
YourSeq  128    192    340   584    90.2%     chr15  +       8491735    8491877    143
YourSeq  127    195    343   584    95.1%     chr12  -       55243401   55243552   152
YourSeq  127    168    341   584    95.1%     chr2   +       25078240   25078441   202
YourSeq  126    195    343   584    89.6%     chrX   -       20750551   20750694   144
YourSeq  126    188    343   584    90.9%     chr5   -       105560269  105560425  157
YourSeq  126    197    341   584    90.8%     chr18  -       36843637   36843777   141

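Note: The 584 bp section downstream of Exon 10 is BLAT searched against the genome. Apart from the expected full-length match at the target locus on chr15, only partial matches (score 151 or lower) are returned. No significant similarity is found.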
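The chr15 coordinates of the two full-length BLAT hits can be cross-checked against the KO region size given in the strategy summary. The sketch below assumes 1-based inclusive coordinates and assumes that the effective KO region is exactly the interval between the upstream and downstream sections; the variable and function names are illustrative.

```python
# Full-length chr15 hits taken from the two BLAT tables (1-based, inclusive).
UPSTREAM_START, UPSTREAM_END = 99075799, 99077288
DOWNSTREAM_START, DOWNSTREAM_END = 99081252, 99081835


def interval_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# Assumption: the KO region is the gap between the two flanking sections.
ko_start = UPSTREAM_END + 1
ko_end = DOWNSTREAM_START - 1
ko_size = interval_length(ko_start, ko_end)

print(interval_length(UPSTREAM_START, UPSTREAM_END))      # upstream section
print(interval_length(DOWNSTREAM_START, DOWNSTREAM_END))  # downstream section
print(ko_start, ko_end, ko_size)
```

Under these assumptions the gap is 3963 bp, which agrees with the effective KO region size of ~3963 bp stated in the strategy summary.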


Gene and information:

Troap trophinin associated protein [Mus musculus (house mouse)]
Gene ID: 78733, updated on 12-Aug-2019

Gene summary

Official Symbol: Troap (provided by MGI)
Official Full Name: trophinin associated protein (provided by MGI)
Primary source: MGI:MGI:1925983
See related: Ensembl:ENSMUSG00000032783
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: tastin; AW476063; E130301L11Rik
Expression: Broad expression in limb E14.5 (RPKM 9.0), CNS E11.5 (RPKM 8.9) and 20 other tissues
Orthologs: human

Genomic context

Location: 15; 15 F1
Exon count: 15

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   15   NC_000081.6 (99074973..99083409)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     15   NC_000081.5 (98905404..98913840)

Chromosome 15 - NC_000081.6


Transcript information: This gene has 6 transcripts

Gene: Troap ENSMUSG00000032783

Description: trophinin associated protein [Source:MGI Symbol;Acc:MGI:1925983]
Gene Synonyms: E130301L11Rik, tastin
Location: Chromosome 15: 99,074,575-99,083,409 forward strand. GRCm38:CM001008.2
About this gene: This gene has 6 transcripts (splice variants), 92 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Troap-204  ENSMUST00000230054.1  2639  668aa       ENSMUSP00000155404.1  Protein coding   CCDS57008  B7ZNG4   GENCODE basic, APPRIS P1
Troap-201  ENSMUST00000039665.7  2255  668aa       ENSMUSP00000035389.6  Protein coding   CCDS57008  B7ZNG4   TSL:1, GENCODE basic, APPRIS P1
Troap-203  ENSMUST00000229740.1  865   No protein  -                     Retained intron  -          -        -
Troap-205  ENSMUST00000230311.1  819   No protein  -                     Retained intron  -          -        -
Troap-206  ENSMUST00000230868.1  714   No protein  -                     Retained intron  -          -        -
Troap-202  ENSMUST00000229224.1  451   No protein  -                     Retained intron  -          -        -

Figure: Ensembl genome browser view of the Troap locus (28.84 kb; 99.07 Mb to 99.09 Mb). Forward strand: Troap-204 and Troap-201 (protein coding); Troap-202, Troap-203, Troap-205 and Troap-206 (retained intron); Dnajc22-202 (lncRNA); Mir6960-201 (miRNA). Reverse strand: Gm34284-201 (lncRNA); C1ql4-201 (protein coding). Contig: AC157610.2. Tracks: Genes (Comprehensive set), Contigs, Regulatory Build.

Regulation Legend: CTCF; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.

Gene Legend: Protein Coding (merged Ensembl/Havana; Ensembl protein coding); Non-Protein Coding (RNA gene; processed transcript).


Transcript: ENSMUST00000230054

Figure: Ensembl protein view of Troap-204 (protein coding; 8.84 kb, forward strand) for ENSMUSP00000155..., with MobiDB lite, Low complexity (Seg) and PANTHER (Tastin) tracks, and sequence variants (dbSNP and all other sources). Scale bar: 0 to 668 aa.

Variant Legend: inframe insertion; inframe deletion; missense variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
