https://www.alphaknockout.com

Mouse Tm9sf1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tm9sf1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tm9sf1 gene (NCBI Reference Sequence: NM_028780; Ensembl: ENSMUSG00000002320) is located on mouse chromosome 14. Six exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 6 (Transcript: ENSMUST00000122358). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tm9sf1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-369O11 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 19.03% of the coding region. The knockout of exons 3~4 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 1005 bp, and the size of intron 4 for 3'-loxP site insertion is 2252 bp. The size of the effective cKO region is ~1704 bp. The cKO region does not contain any other known gene.
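The frameshift statement above rests on a simple rule: removing coding sequence whose total length is not a multiple of 3 shifts the downstream reading frame. The following is a minimal sketch of that check. The function name and the example exon lengths are hypothetical and are not taken from this report, which does not list the coding lengths of exons 3 and 4.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting coding segments of these lengths (bp) shifts the reading frame."""
    total_deleted = sum(deleted_coding_lengths)
    # A deletion keeps the frame only when the removed length is a multiple of 3.
    return total_deleted % 3 != 0


# Hypothetical coding lengths (bp) for two deleted exons; NOT values from this report.
example_lengths = [200, 150]
print(causes_frameshift(example_lengths))
```

To apply the check to this project, the actual coding lengths of exons 3 and 4 would need to be taken from the transcript annotation (ENSMUST00000122358).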


Overview of the Targeting Strategy

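[Figure: schematic of the targeting strategy, showing the wildtype allele (5' to 3', with numbered exons and the two gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Tm9sf1; cKO region; loxP site.]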


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself; forward and reverse-complement matches are shown.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
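The self-alignment described in the note can be illustrated with a small word-match dot plot. This is a minimal sketch, not the tool used to produce the report: the function name and the toy sequence are hypothetical, only forward-strand matches are reported, and an exact-match criterion is assumed.

```python
from collections import defaultdict


def self_dotplot_matches(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where the window-length words at i and j are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    matches = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Hypothetical toy sequence with a short tandem repeat; a window of 8 is used
# here only because the toy sequence is short (the report uses 10 bp).
toy_sequence = "ACGTACGTACGTTTGCA"
print(self_dotplot_matches(toy_sequence, window=8))
```

Runs of matches along a line parallel to the main diagonal would indicate a repeat; a sequence without such runs gives an empty or sparse list.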

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8204bp) | A(24.15% 1981) | C(24.09% 1976) | T(26.52% 2176) | G(25.24% 2071)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
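The overall GC content implied by the summary line can be recomputed from the base counts, and a windowed profile can be sketched in the same way. The base counts below come from the summary above; the helper name windowed_gc, its step parameter, and the toy sequence are assumptions for illustration, since the report does not state how its 300 bp windows are stepped.

```python
# Base counts from the summary line (full length 8204 bp).
counts = {"A": 1981, "C": 1976, "T": 2176, "G": 2071}
total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def windowed_gc(seq, window=300, step=300):
    """Return the GC percentage of each full window of the sequence.

    The step size is an assumption; the report only gives the window size.
    """
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


# Hypothetical toy sequence, small window for illustration only.
print(windowed_gc("GGCCAATT", window=4, step=4))
```

The overall value sits close to 50%, so the difficulty flagged in the note refers to local windows rather than to the average.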


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       55641839   55644838   3000
YourSeq  177    588    819   3000   89.0%     chr14  -       122305933  122306132  200
YourSeq  177    617    822   3000   92.4%     chr6   +       41357053   41357253   201
YourSeq  176    625    822   3000   93.4%     chr4   -       137997509  137997704  196
YourSeq  176    630    818   3000   96.9%     chr6   +       24161519   24161710   192
YourSeq  176    617    822   3000   94.5%     chr13  +       55585009   55585628   620
YourSeq  175    635    822   3000   96.8%     chr12  -       54882110   54882300   191
YourSeq  174    621    822   3000   91.2%     chr11  -       107125539  107125732  194
YourSeq  173    635    857   3000   95.8%     chr4   +       108306178  108306401  224
YourSeq  172    635    822   3000   94.7%     chr8   +       84849983   84850169   187
YourSeq  172    635    822   3000   94.7%     chr17  +       26922010   26922196   187
YourSeq  172    635    822   3000   94.7%     chr10  +       128324852  128325038  187
YourSeq  171    617    822   3000   93.9%     chr4   -       144044324  144044530  207
YourSeq  171    635    822   3000   95.8%     chr13  -       50448793   50448982   190
YourSeq  171    635    822   3000   95.8%     chr11  -       88164086   88164274   189
YourSeq  171    634    823   3000   95.3%     chr13  +       8874806    8874995    190
YourSeq  170    635    822   3000   94.7%     chr7   -       116113989  116114175  187
YourSeq  170    632    822   3000   92.5%     chr7   -       64264003   64264188   186
YourSeq  170    635    820   3000   94.6%     chr4   -       136039499  136039683  185
YourSeq  170    635    822   3000   94.2%     chr2   -       29936544   29936730   187

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
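One way to read a BLAT table like the one above is to set aside the full-length, 100%-identity self hit and then look at how much of the query the best remaining hit spans. The sketch below does this for the first three rows of the upstream table. The tuple layout and function names are hypothetical, and the report does not state what threshold it uses to call a similarity "significant".

```python
# Each tuple: (score, q_start, q_end, q_size, identity_percent, chrom, strand, t_start, t_end)
# First three rows of the upstream BLAT table.
hits = [
    (3000, 1, 3000, 3000, 100.0, "chr14", "-", 55641839, 55644838),
    (177, 588, 819, 3000, 89.0, "chr14", "-", 122305933, 122306132),
    (177, 617, 822, 3000, 92.4, "chr6", "+", 41357053, 41357253),
]


def split_self_hit(all_hits):
    """Separate full-length, 100%-identity hits (the query itself) from all other hits."""
    self_hits = [h for h in all_hits if h[1] == 1 and h[2] == h[3] and h[4] == 100.0]
    others = [h for h in all_hits if h not in self_hits]
    return self_hits, others


def query_coverage(hit):
    """Fraction of the query spanned by a hit (query coordinates are 1-based, inclusive)."""
    return (hit[2] - hit[1] + 1) / hit[3]


self_hits, others = split_self_hit(hits)
best_other = max(others, key=lambda h: h[0])
print(len(self_hits), best_other[0], round(100 * query_coverage(best_other), 1))
```

For these rows, the best non-self hit spans only a small fraction of the 3000 bp query, which is in line with the note's conclusion.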

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       55637135   55640134   3000
YourSeq  285    1639   2630  3000   92.6%     chr2   -       152970475  153159165  188691
YourSeq  197    1505   2595  3000   95.1%     chr1   -       181750731  181951707  200977
YourSeq  186    2463   2721  3000   88.1%     chr2   -       144391309  144391910  602
YourSeq  185    2450   2703  3000   90.1%     chr4   -       126209916  126210494  579
YourSeq  185    2442   2686  3000   91.9%     chr11  +       79926481   79927017   537
YourSeq  168    2446   2693  3000   90.4%     chr17  +       26144962   26145627   666
YourSeq  165    2442   2675  3000   91.0%     chr17  +       45620872   45621205   334
YourSeq  164    2448   2727  3000   91.8%     chr18  +       67801216   67801631   416
YourSeq  164    2463   2692  3000   91.0%     chr11  +       97881234   97881549   316
YourSeq  161    2438   2665  3000   92.2%     chr4   +       134323948  134324182  235
YourSeq  157    2447   2712  3000   93.0%     chr17  -       35511595   35512167   573
YourSeq  155    2452   2722  3000   83.6%     chrX   -       12807926   12808140   215
YourSeq  155    2446   2630  3000   90.7%     chr15  -       80230065   80230247   183
YourSeq  154    2450   2630  3000   93.8%     chr13  -       3306945    3307125    181
YourSeq  154    2443   2630  3000   92.9%     chr1   -       63234085   63234634   550
YourSeq  151    2446   2630  3000   93.2%     chr2   +       92162438   92281851   119414
YourSeq  151    2448   2630  3000   92.2%     chr11  +       61625971   61626155   185
YourSeq  149    2446   2630  3000   89.2%     chr14  -       30537623   30537806   184
YourSeq  148    2446   2630  3000   91.7%     chr3   -       58551064   58551252   189

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
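The self hits in the two BLAT tables give the genomic positions of the 3000 bp flanks on chr14, and the interval between them can be compared with the stated cKO region size. The sketch below assumes the BLAT coordinates are 1-based and inclusive (consistent with the SPAN column of 3000) and that the two flanks directly abut the cKO region; the function name is hypothetical.

```python
# Self-hit genomic coordinates from the two BLAT tables (chr14, minus strand).
# Because the gene is on the reverse strand, the upstream flank has the higher coordinates.
upstream_flank = (55641839, 55644838)
downstream_flank = (55637135, 55640134)


def region_between(lower_flank, higher_flank):
    """Return (start, end, length) of the interval lying strictly between two flanks."""
    start = lower_flank[1] + 1
    end = higher_flank[0] - 1
    return start, end, end - start + 1


cko_start, cko_end, cko_length = region_between(downstream_flank, upstream_flank)
print(cko_start, cko_end, cko_length)
```

Under these assumptions the interval length agrees with the effective cKO region size of ~1704 bp given in the strategy note.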


Gene and information: Tm9sf1 transmembrane 9 superfamily member 1 [ Mus musculus (house mouse) ] Gene ID: 74140, updated on 24-Oct-2019

Gene summary

Official Symbol: Tm9sf1 (provided by MGI)
Official Full Name: transmembrane 9 superfamily member 1 (provided by MGI)
Primary source: MGI:MGI:1921390
See related: Ensembl:ENSMUSG00000002320
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: MP70; AI893436; 1200014D02Rik
Expression: Ubiquitous expression in colon adult (RPKM 42.3), adrenal adult (RPKM 39.5) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 14; 14 C3

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (55635965..55643806, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (56254803..56262643, complement)

Chromosome 14 - NC_000080.6


Transcript information: This gene has 13 transcripts

Gene: Tm9sf1 ENSMUSG00000002320

Description: transmembrane 9 superfamily member 1 [Source:MGI Symbol;Acc:MGI:1921390]
Gene Synonyms: 1200014D02Rik, MP70
Location: 55,635,965-55,643,806 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 13 transcripts (splice variants), 168 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tm9sf1-205  ENSMUST00000122358.7   2517  606aa       ENSMUSP00000113782.1  Protein coding           CCDS27122  Q9DBU0   TSL:1, GENCODE basic, APPRIS P1
Tm9sf1-201  ENSMUST00000002391.14  2382  606aa       ENSMUSP00000002391.8  Protein coding           CCDS27122  Q9DBU0   TSL:1, GENCODE basic, APPRIS P1
Tm9sf1-202  ENSMUST00000120041.7   2164  606aa       ENSMUSP00000112893.1  Protein coding           CCDS27122  Q9DBU0   TSL:5, GENCODE basic, APPRIS P1
Tm9sf1-203  ENSMUST00000121791.7   2148  606aa       ENSMUSP00000112764.1  Protein coding           CCDS27122  Q9DBU0   TSL:5, GENCODE basic, APPRIS P1
Tm9sf1-204  ENSMUST00000121937.7   2398  589aa       ENSMUSP00000113143.1  Protein coding           -          D3Z6X7   TSL:5, GENCODE basic
Tm9sf1-208  ENSMUST00000132338.7   1742  484aa       ENSMUSP00000118427.1  Protein coding           -          D3YWH4   CDS 3' incomplete, TSL:5
Tm9sf1-210  ENSMUST00000138085.1   708   172aa       ENSMUSP00000119435.1  Protein coding           -          D3Z279   CDS 3' incomplete, TSL:3
Tm9sf1-209  ENSMUST00000133707.1   576   118aa       ENSMUSP00000123471.1  Protein coding           -          D3YYW1   CDS 3' incomplete, TSL:5
Tm9sf1-213  ENSMUST00000149726.7   2089  417aa       ENSMUSP00000115403.1  Nonsense mediated decay  -          D6RGM8   TSL:5
Tm9sf1-206  ENSMUST00000127473.1   2271  No protein  -                     Retained intron          -          -        TSL:1
Tm9sf1-212  ENSMUST00000146588.1   760   No protein  -                     Retained intron          -          -        TSL:2
Tm9sf1-211  ENSMUST00000139313.1   840   No protein  -                     lncRNA                   -          -        TSL:3
Tm9sf1-207  ENSMUST00000130167.7   485   No protein  -                     lncRNA                   -          -        TSL:3


[Figure: Ensembl genomic context view, 27.84 kb window spanning 55.63 Mb to 55.65 Mb. Forward strand: Gm49747-201 (lncRNA) and Tssk4 transcripts 201 to 207 (protein coding, except Tssk4-205, nonsense mediated decay). Contig: AC174678.2. Reverse strand, Genes (Comprehensive set) track: Ipo4 transcripts 201 to 213, Gm49378-201 (nonsense mediated decay), and Tm9sf1 transcripts 201 to 213. Regulatory Build legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000122358

[Figure: protein view of Tm9sf1-205 (protein coding; reverse strand, 7.84 kb) for ENSMUSP00000113... Tracks: Transmembrane heli...; Low complexity (Seg); Cleavage site (Sign...; Pfam: Nonaspanin (TM9SF); PANTHER: PTHR10766:SF14, Nonaspanin (TM9SF); sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 606 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
