https://www.alphaknockout.com

Mouse Ints9 Knockout Project (CRISPR/Cas9)

Objective: To create an Ints9 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ints9 gene (NCBI Reference Sequence: NM_153414; Ensembl: ENSMUSG00000021975) is located on mouse chromosome 14. Seventeen exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 17 (Transcript: ENSMUST00000043914). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 4.71% of the coding region and covers 6.21% of it. The size of the effective KO region is ~128 bp. The KO region does not contain any other known gene.
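The percentages in the note can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions that the report does not state outright: the coding region is taken as 687 aa x 3 = 2061 nt (stop codon excluded), and exon 2 is treated as exactly 128 bp of coding sequence, matching the "~128 bp" KO region. Under those assumptions, removing exon 2 would also shift the reading frame, because 128 is not a multiple of 3.

```python
# Assumptions (hypothetical, for illustration):
#   - CDS length = 687 aa * 3 nt, stop codon not counted
#   - exon 2 is fully coding and exactly 128 bp long
PROTEIN_LENGTH_AA = 687
EXON2_LENGTH_BP = 128

cds_length = PROTEIN_LENGTH_AA * 3  # 2061 nt


def percent_of_cds(n_bases, cds_len=cds_length):
    """Share of the coding region covered by n_bases, in percent."""
    return round(100 * n_bases / cds_len, 2)


def causes_frameshift(deleted_bases):
    """A deletion shifts the reading frame unless it is a multiple of 3."""
    return deleted_bases % 3 != 0


exon2_share = percent_of_cds(EXON2_LENGTH_BP)
# Coding bases upstream of exon 2, back-calculated from the 4.71% figure.
upstream_coding = round(0.0471 * cds_length)
frameshift = causes_frameshift(EXON2_LENGTH_BP)

print(f"CDS length: {cds_length} nt")
print(f"Exon 2 share of CDS: {exon2_share}%")
print(f"Coding bases before exon 2: ~{upstream_coding} nt")
print(f"Deleting exon 2 shifts the frame: {frameshift}")
```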


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2 and 17 of mouse Ints9, with one gRNA region on each side of exon 2. Legend: exon of mouse Ints9; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward matches; reverse complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of the sequence against itself. Legend: forward matches; reverse complement matches.]

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
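The self-alignment behind a dot plot can be sketched as a search for identical windows at different positions of the same sequence. The example below is a minimal illustration, not the tool used for this report: the function name `dot_plot_matches` and the toy sequences are hypothetical, only forward-strand matches are considered, and a window of 8 is used so the toy input stays short (the report uses 15 bp).

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the window starting
    at i is identical to the window starting at j (forward strand only)."""
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        # Every pair of positions sharing the same window is an
        # off-diagonal dot in the plot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy input: an 8 bp unit repeated three times in tandem.
tandem = "GATTACA" + "ACGTTGCA" * 3 + "CCGGA"
tandem_hits = dot_plot_matches(tandem, window=8)

# Toy input with no repeated 8-mer.
unique_hits = dot_plot_matches("ACGTAGCTAGGATCCTTAGACGGA", window=8)

print(tandem_hits[:5])
print(unique_hits)
```

Runs of such pairs with a constant offset appear as lines parallel to the main diagonal, which is the signature of a tandem repeat. An empty result corresponds to the "no significant tandem repeat" outcome described in the notes.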


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(25.9% 518) | C(20.8% 416) | T(31.8% 636) | G(21.5% 430)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(2000bp) | A(26.95% 539) | C(20.95% 419) | T(31.9% 638) | G(20.2% 404)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
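The overall GC content of each flank follows directly from the base counts in the two summaries, and the plotted distribution corresponds to the same calculation repeated over a sliding window. The sketch below is a minimal illustration: the function names are hypothetical, and the sliding-window demo runs on a toy string rather than the actual 2000 bp sequences, which are not reproduced in this report.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC content in percent from per-base counts."""
    total = a + c + t + g
    return round(100 * (g + c) / total, 2)


def sliding_gc(seq, window=300, step=1):
    """GC content in percent for each window along the sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100 * gc / window)
    return values


# Base counts copied from the two summaries above.
upstream_gc = gc_percent_from_counts(a=518, c=416, t=636, g=430)
downstream_gc = gc_percent_from_counts(a=539, c=419, t=638, g=404)

# Toy demonstration of the windowed calculation.
toy_profile = sliding_gc("GGCCAATT", window=4, step=2)

print(f"Upstream GC: {upstream_gc}%")
print(f"Downstream GC: {downstream_gc}%")
print(toy_profile)
```

Both flanks come out a little above 40% GC overall, which is consistent with the notes that no high GC-content region was found.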


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  +       64978123   64980122   2000
YourSeq  127    939    1318  2000   85.4%     chr11  -       120087846  120088113  268
YourSeq  126    924    1068  2000   93.8%     chr5   -       149859447  149859592  146
YourSeq  121    935    1076  2000   93.7%     chr5   +       136541395  136541559  165
YourSeq  120    935    1064  2000   97.0%     chr17  +       23181091   23181437   347
YourSeq  118    935    1055  2000   99.2%     chr3   -       97882008   97882130   123
YourSeq  118    929    1055  2000   96.9%     chr18  -       36218484   36218612   129
YourSeq  117    924    1055  2000   94.7%     chr11  -       76181099   76181232   134
YourSeq  116    922    1060  2000   92.8%     chr8   -       57519519   57519658   140
YourSeq  116    935    1055  2000   98.4%     chr14  -       34394859   34394981   123
YourSeq  116    935    1055  2000   98.4%     chr15  +       45708636   45708758   123
YourSeq  115    867    1055  2000   84.9%     chr16  -       30127920   30128062   143
YourSeq  115    928    1055  2000   95.4%     chr2   +       91558783   91558912   130
YourSeq  114    928    1055  2000   96.0%     chrX   -       161303330  161303458  129
YourSeq  114    923    1055  2000   93.9%     chr5   -       121788439  121788572  134
YourSeq  114    935    1055  2000   97.6%     chr19  -       4179849    4179971    123
YourSeq  114    929    1055  2000   96.8%     chr6   +       48984064   48984192   129
YourSeq  114    939    1072  2000   93.9%     chr5   +       135744864  135745000  137
YourSeq  114    935    1055  2000   97.6%     chr18  +       53122486   53122608   123
YourSeq  113    928    1055  2000   94.6%     chr16  +       3343263    3343392    130

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  +       64980251   64982250   2000
YourSeq  52     1790   1850  2000   95.0%     chr2   -       163913529  164079304  165776
YourSeq  43     1692   1846  2000   82.4%     chr10  -       36938996   36939325   330
YourSeq  37     1793   1846  2000   93.1%     chrX   -       160277588  160277647  60
YourSeq  37     1781   1831  2000   86.3%     chr14  -       20344761   20344811   51
YourSeq  37     1788   1833  2000   91.4%     chr13  +       11915084   11915132   49
YourSeq  36     1796   1855  2000   80.0%     chr2   +       120424774  120424833  60
YourSeq  35     1801   1855  2000   81.9%     chr6   -       148783123  148783177  55
YourSeq  35     1796   1852  2000   80.8%     chr12  +       10427420   10427476   57
YourSeq  35     426    474   2000   73.0%     chr11  +       46624026   46624062   37
YourSeq  35     425    478   2000   88.9%     chr10  +       71894611   71894752   142
YourSeq  34     1799   1852  2000   81.5%     chr17  -       34127066   34127119   54
YourSeq  34     423    457   2000   100.0%    chr10  -       70280586   70280622   37
YourSeq  34     1797   1852  2000   76.4%     chr13  +       73967407   73967461   55
YourSeq  34     1799   1850  2000   82.7%     chr13  +       38070876   38070927   52
YourSeq  33     1795   1844  2000   82.3%     chr15  +       78648606   78648654   49
YourSeq  32     1798   1836  2000   97.1%     chr8   +       117501667  117501706  40
YourSeq  32     1797   1850  2000   97.1%     chr4   +       135936239  135936293  55
YourSeq  32     1783   1833  2000   86.4%     chr3   +       89293966   89294026   61
YourSeq  32     1796   1833  2000   92.2%     chr2   +       72830119   72830156   38

Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
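The top hit in each BLAT table is the query mapping back to its own locus on chr14, so the two sets of coordinates also bracket the targeted region. The sketch below is an illustrative consistency check under one assumption: that the upstream and downstream flanks directly abut the knockout region, which the report implies but does not state. The helper name `gap_between` is hypothetical.

```python
# Self-hits copied from the two BLAT tables (1-based, inclusive coordinates).
UP_FLANK = ("chr14", 64978123, 64980122)
DOWN_FLANK = ("chr14", 64980251, 64982250)


def flank_length(flank):
    """Length in bp of a (chrom, start, end) interval."""
    _, start, end = flank
    return end - start + 1


def gap_between(up, down):
    """Interval lying strictly between two flanks on the same chromosome.

    Assumes the flanks were extracted immediately adjacent to the
    targeted region (an assumption, not stated in the report).
    """
    if up[0] != down[0]:
        raise ValueError("flanks are on different chromosomes")
    start = up[2] + 1
    end = down[1] - 1
    return start, end, end - start + 1


gap_start, gap_end, gap_size = gap_between(UP_FLANK, DOWN_FLANK)

print(f"Upstream flank: {flank_length(UP_FLANK)} bp")
print(f"Downstream flank: {flank_length(DOWN_FLANK)} bp")
print(f"Region between flanks: chr14:{gap_start}-{gap_end} ({gap_size} bp)")
```

The 128 bp interval between the flanks agrees with the ~128 bp effective KO region given in the strategy summary.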


Gene and information: Ints9 integrator complex subunit 9 [ Mus musculus (house mouse) ] Gene ID: 210925, updated on 12-Aug-2019

Gene summary

Official Symbol: Ints9 (provided by MGI)
Official Full Name: integrator complex subunit 9 (provided by MGI)
Primary source: MGI:MGI:1098533
See related: Ensembl:ENSMUSG00000021975
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BC028953; D14Ertd231e
Expression: Ubiquitous expression in thymus adult (RPKM 12.8), limb E14.5 (RPKM 10.8) and 28 other tissues

Genomic context

Location: 14 D1; 14 33.81 cM
Exon count: 17

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   14   NC_000080.6 (64950045..65039835)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     14   NC_000080.5 (65568882..65658663)

Chromosome 14 - NC_000080.6


Transcript information: This gene has 3 transcripts

Gene: Ints9 ENSMUSG00000021975

Description: integrator complex subunit 9 [Source:MGI Symbol;Acc:MGI:1098533]
Gene Synonyms: D14Ertd231e
Location: Chromosome 14: 64,950,045-65,039,832 forward strand. GRCm38:CM001007.2
About this gene: This gene has 3 transcripts (splice variants), 201 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Ints9-201  ENSMUST00000043914.6  2674  687aa       ENSMUSP00000045552.5  Protein coding   CCDS27210  A0A0R4J0J5  TSL:1 GENCODE basic APPRIS P1
Ints9-202  ENSMUST00000224593.1  3476  No protein  -                     Retained intron  -          -           -
Ints9-203  ENSMUST00000225790.1  3074  No protein  -                     Retained intron  -          -           -


[Figure: Ensembl region view of chromosome 14, 64.96 Mb to 65.04 Mb (109.79 kb). Forward strand: Ints9-201 (protein coding), Ints9-202 and Ints9-203 (retained intron). Contigs: AC141425.3, AC155172.4. Reverse strand: Hmbox1-201, -202, -203, -204, -205, -207 and -209 (protein coding) and Hmbox1-208 (retained intron). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000043914

[Figure: Ensembl view of Ints9-201 (protein coding; 89.79 kb, forward strand) with protein ENSMUSP00000045... on a scale of 0 to 687 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily: Ribonuclease Z/Hydroxyacylglutathione hydrolase-like; SMART: Beta-Casp domain; Pfam: Metallo-beta-lactamase, Beta-Casp domain; PANTHER: Integrator complex subunit 9; Gene3D: 3.40.50.10890; CDD: cd16294. Sequence variants (dbSNP and all other sources) are shown with the legend: stop gained, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
