https://www.alphaknockout.com

Mouse Trank1 Knockout Project (CRISPR/Cas9)

Objective: To create a Trank1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Trank1 gene (NCBI Reference Sequence: NM_001164659; Ensembl: ENSMUSG00000062296) is located on mouse chromosome 9. A total of 22 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 22 (Transcript: ENSMUST00000078626). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 4.57% of the coding region. Exons 2~3 cover 3.09% of the coding region. The size of the effective KO region is ~9548 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1, 2, 3 and 22 labeled, and gRNA regions marked on either side of the knockout region. Legend: exon of mouse Trank1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
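The self-alignment behind these dot plots can be sketched as follows. This is only an illustration, assuming exact matching of 15 bp windows; the report does not name the tool or its scoring, and the function names here are hypothetical.

```python
# Minimal self dot plot sketch (assumption: exact window matches only).
# Off-diagonal forward hits suggest direct/tandem repeats; hits against
# the reverse complement suggest inverted repeats.

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T/N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def window_matches(seq_x, seq_y, window=15):
    """Return (i, j) pairs where seq_x[i:i+window] == seq_y[j:j+window]."""
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            hits.append((i, j))
    return hits

def self_dot_plot(seq, window=15):
    """Return (forward, reverse) hit lists for a sequence against itself."""
    seq = seq.upper()
    # Drop the trivial main diagonal (every window matches itself).
    forward = [(i, j) for i, j in window_matches(seq, seq, window) if i != j]
    reverse = window_matches(seq, reverse_complement(seq), window)
    return forward, reverse
```

For a flank without repeats, both lists would be empty or nearly so; a block of off-diagonal forward hits running parallel to the diagonal would indicate a tandem repeat that could complicate PCR or sequencing.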


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full length (2000 bp) | A (26.6%, 532) | C (19.4%, 388) | T (31.7%, 634) | G (22.3%, 446)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full length (2000 bp) | A (24.85%, 497) | C (20.55%, 411) | T (32.3%, 646) | G (22.3%, 446)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
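The base counts in the two summaries can be turned into overall GC percentages, and a windowed profile can be computed in the same spirit as the 300 bp plots. The sketch below assumes a simple sliding window over an uppercase DNA string; the report does not state the step size or tool, so `step` and the function names are hypothetical.

```python
# Overall GC from the reported base counts, plus a sliding-window GC
# profile (assumption: plain fraction of G and C per window).

UPSTREAM_COUNTS = {"A": 532, "C": 388, "T": 634, "G": 446}    # from the (up) summary
DOWNSTREAM_COUNTS = {"A": 497, "C": 411, "T": 646, "G": 446}  # from the (down) summary

def gc_percent(counts):
    """Percentage of G+C among all counted bases."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total

def sliding_gc(seq, window=300, step=1):
    """GC percentage for each window of the given size along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        profile.append(100.0 * gc / window)
    return profile

if __name__ == "__main__":
    print(round(gc_percent(UPSTREAM_COUNTS), 2))    # 41.7
    print(round(gc_percent(DOWNSTREAM_COUNTS), 2))  # 42.85
```

With the reported counts, overall G+C comes to 41.7% upstream and 42.85% downstream, which is consistent with the notes that neither flank is GC-rich overall.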


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr9   +       111331699  111333698  2000
YourSeq  21     1963   1983  2000   100.0%    chr8   +       119085586  119085606  21

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr9   +       111343247  111345246  2000
YourSeq  70     1268   1382  2000   83.6%     chr4   +       123064889  123064993  105
YourSeq  68     1241   1380  2000   73.1%     chr6   -       147004624  147004753  130
YourSeq  55     1274   1381  2000   75.4%     chr10  +       69379398   69379492   95
YourSeq  50     1110   1295  2000   70.2%     chr2   -       135174519  135174621  103
YourSeq  50     1281   1384  2000   74.7%     chr7   +       72995770   72995862   93
YourSeq  50     1281   1384  2000   77.3%     chr6   +       111165465  111165557  93
YourSeq  45     1217   1291  2000   76.3%     chr19  -       40266955   40267020   66
YourSeq  44     1325   1395  2000   74.3%     chr2   -       53083313   53083378   66
YourSeq  44     1281   1384  2000   80.0%     chr5   +       122426040  122426132  93
YourSeq  40     1241   1292  2000   88.5%     chr3   -       55065576   55065627   52
YourSeq  40     1325   1382  2000   80.8%     chr17  +       74355895   74355951   57
YourSeq  39     1241   1304  2000   78.0%     chr5   -       90354642   90354703   62
YourSeq  38     1281   1385  2000   95.3%     chr15  -       34451294   34451571   278
YourSeq  38     1249   1304  2000   84.0%     chr11  -       75579601   75579656   56
YourSeq  38     1335   1384  2000   88.0%     chr10  -       17716260   17716309   50
YourSeq  37     1241   1300  2000   84.5%     chr6   +       40029051   40029108   58
YourSeq  35     1241   1291  2000   80.0%     chr6   -       5781578    5781627    50
YourSeq  35     1241   1285  2000   88.9%     chr5   -       132855463  132855507  45
YourSeq  35     1329   1389  2000   70.4%     chr18  -       11468835   11468888   54

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
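The top BLAT hit of each flank gives its chr9 position, and the interval between the two flanks can be compared with the stated KO region size. The sketch below assumes the listed coordinates are 1-based and inclusive (consistent with SPAN = END - START + 1 in the tables); the variable and function names are hypothetical.

```python
# Derive the interval between the upstream and downstream 2000 bp flanks
# from their full-length chr9 BLAT hits (assumption: 1-based, inclusive).

UP_FLANK = ("chr9", 111331699, 111333698)    # upstream of Exon 2
DOWN_FLANK = ("chr9", 111343247, 111345246)  # downstream of Exon 3

def span(start, end):
    """Length of an inclusive interval."""
    return end - start + 1

def region_between(up, down):
    """Interval lying strictly between two flanks on the same chromosome."""
    if up[0] != down[0]:
        raise ValueError("flanks are on different chromosomes")
    return (up[0], up[2] + 1, down[1] - 1)

ko = region_between(UP_FLANK, DOWN_FLANK)
ko_size = span(ko[1], ko[2])

if __name__ == "__main__":
    print(ko)       # ('chr9', 111333699, 111343246)
    print(ko_size)  # 9548
```

Under this assumption the interval between the flanks is 9548 bp, matching the effective KO region size given in the strategy summary.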


Gene information:

Trank1, tetratricopeptide repeat and ankyrin repeat containing 1 [Mus musculus (house mouse)]
Gene ID: 320429, updated on 12-Aug-2019

Gene summary

Official Symbol: Trank1 (provided by MGI)
Official Full Name: tetratricopeptide repeat and ankyrin repeat containing 1 (provided by MGI)
Primary source: MGI:MGI:1341834
See related: Ensembl:ENSMUSG00000062296
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Lba1; Gm187; A230061D21Rik; C030048J01Rik
Expression: Biased expression in frontal lobe adult (RPKM 42.3), cortex adult (RPKM 41.4) and 7 other tissues
Orthologs: human

Genomic context

Location: 9; 9 F3
Exon count: 25

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 9 | NC_000075.6 (111311283..111395775)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 9 | NC_000075.5 (111214243..111298279)

[Figure: genomic context of Trank1 on Chromosome 9 - NC_000075.6]


Transcript information: This gene has 7 transcripts

Gene: Trank1 ENSMUSG00000062296

Description: tetratricopeptide repeat and ankyrin repeat containing 1 [Source:MGI Symbol;Acc:MGI:1341834]
Gene Synonyms: A230061D21Rik, C030048J01Rik, LOC235639, Lba1
Location: Chromosome 9: 111,311,739-111,395,775 forward strand. GRCm38:CM001002.2
About this gene: This gene has 7 transcripts (splice variants), 131 orthologues, 2 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Trank1-201 | ENSMUST00000078626.7 | 10520 | 2999aa | ENSMUSP00000077697.3 | Protein coding | CCDS52942 | Q8BV79 | TSL:5, GENCODE basic, APPRIS P1
Trank1-203 | ENSMUST00000197049.1 | 560 | 142aa | ENSMUSP00000143534.1 | Protein coding | - | A0A0G2JGE4 | CDS 3' incomplete, TSL:2
Trank1-204 | ENSMUST00000197650.1 | 618 | No protein | - | Retained intron | - | - | TSL:2
Trank1-202 | ENSMUST00000196945.4 | 5546 | No protein | - | lncRNA | - | - | TSL:1
Trank1-206 | ENSMUST00000200151.1 | 3356 | No protein | - | lncRNA | - | - | TSL:1
Trank1-205 | ENSMUST00000198890.4 | 835 | No protein | - | lncRNA | - | - | TSL:3
Trank1-207 | ENSMUST00000200272.1 | 750 | No protein | - | lncRNA | - | - | TSL:3

[Figure: Ensembl genome browser view of a 104.04 kb region of chromosome 9 (about 111.32 Mb to 111.40 Mb), forward strand. Tracks: comprehensive gene set with Trank1-201 (protein coding), Trank1-203 (protein coding), Trank1-202, Trank1-205, Trank1-206 and Trank1-207 (lncRNA), and Trank1-204 (retained intron); contigs AC126677.3 and AC113495.15; Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000078626

[Figure: Ensembl protein view of Trank1-201 (protein coding), 84.04 kb, forward strand, with a scale bar from 0 to 2999 amino acids. Annotation tracks for ENSMUSP00000077...:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: P-loop containing nucleoside triphosphate hydrolase; Ankyrin repeat-containing domain superfamily; Tetratricopeptide-like helical domain superfamily
SMART: Ankyrin repeat
PROSITE profiles: Tetratricopeptide repeat-containing domain; Ankyrin repeat-containing domain
PANTHER: PTHR21529:SF4; TPR and ankyrin repeat-containing protein 1
Gene3D: Ankyrin repeat-containing domain superfamily; 3.40.50.300; Tetratricopeptide-like helical domain superfamily
Sequence variants (dbSNP and all other sources): missense variant, synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
