https://www.alphaknockout.com

Mouse Tom1l2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Tom1l2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tom1l2 gene (NCBI Reference Sequence: NM_153080; Ensembl: ENSMUSG00000000538) is located on mouse chromosome 11. 15 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 15 (Transcript: ENSMUST00000102683). Exon 3 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tom1l2 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP23-456O18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a hypomorphic gene trap allele show malocclusion, kyphosis, hydrocephaly, patchy hair, splenomegaly, high B- and T-cell counts, thrombopenia, impaired humoral responses, a high frequency of infections and tumors, renal cysts, skin lesions, freezing behavior and sporadic bleeding.

Exon 3 starts about 9.07% of the way into the coding region. Knocking out exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 5196 bp, and the size of intron 3 for 3'-loxP site insertion is 4456 bp. The size of the effective cKO region is ~579 bp. The cKO region does not contain any other known gene.
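As a rough check of the figures above, the following sketch converts the 9.07% offset into an approximate nucleotide position within the coding sequence and encodes the usual rule of thumb that deleting an exon whose length is not a multiple of 3 shifts the reading frame. The CDS length is derived from the 507 aa protein listed for ENSMUST00000102683 elsewhere in this report. The function names are illustrative, and the exon lengths in the last two lines are hypothetical, because the report does not state the length of exon 3.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding sequence length in nucleotides for a protein of protein_aa residues."""
    return protein_aa * 3 + (3 if include_stop else 0)


def offset_in_cds(fraction: float, cds_nt: int) -> int:
    """Approximate nucleotide offset for a fractional position in the CDS."""
    return round(fraction * cds_nt)


def causes_frameshift(exon_len_nt: int) -> bool:
    """Deleting a coding exon shifts the frame when its length is not a multiple of 3."""
    return exon_len_nt % 3 != 0


# 507 aa comes from the transcript table for ENSMUST00000102683.
cds_nt = cds_length_nt(507)
# 9.07% is the exon 3 start position quoted in the report.
exon3_offset = offset_in_cds(0.0907, cds_nt)
print(f"CDS length: {cds_nt} nt; exon 3 starts near CDS position {exon3_offset}")

# Hypothetical exon lengths, for illustration only.
print(causes_frameshift(100))  # length not divisible by 3
print(causes_frameshift(99))   # length divisible by 3
```

Under these assumptions exon 3 begins roughly 46 codons into the open reading frame, so a frameshift introduced there would disrupt most of the protein.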


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 3 and 15 labeled, with the 5' and 3' gRNA regions flanking exon 3), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Tom1l2; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
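The self-alignment described in the note can be sketched as a word-match dot plot. The code below is a minimal illustration, not the tool used to produce the report: it records every pair of positions whose 10 bp windows are identical, once against the sequence itself (forward) and once against its reverse complement. All function names are hypothetical, and exact matching is an assumption; real dot plot tools often tolerate mismatches.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_matches(seq_x: str, seq_y: str, window: int = 10):
    """Return (i, j) pairs where seq_x[i:i+window] equals seq_y[j:j+window]."""
    # Index every window of seq_y so that lookups are constant time.
    index = defaultdict(list)
    for j in range(len(seq_y) - window + 1):
        index[seq_y[j:j + window]].append(j)
    matches = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], ()):
            matches.append((i, j))
    return matches


def self_dot_plot(seq: str, window: int = 10):
    """Forward matches off the main diagonal, plus reverse-complement matches."""
    seq = seq.upper()
    forward = [(i, j) for i, j in dot_plot_matches(seq, seq, window) if i != j]
    reverse = dot_plot_matches(seq, reverse_complement(seq), window)
    return forward, reverse


# Toy sequence containing one direct repeat of a 10 bp unit.
toy = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"
forward_hits, reverse_hits = self_dot_plot(toy, window=10)
print(forward_hits)
```

Off-diagonal forward hits that line up in parallel diagonals close to the main diagonal are the signature of tandem repeats; a sequence without such repeats shows only the main diagonal.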

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plotted along the sequence.]

Summary: Full Length(7079bp) | A(25.09% 1776) | C(23.35% 1653) | T(29.28% 2073) | G(22.28% 1577)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
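The summary line and the windowed analysis can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions rather than the report's actual pipeline: `base_composition` and `gc_windows` are hypothetical helper names, the step size of the sliding window is not given in the report, and the final lines only recompute the overall GC fraction from the base counts quoted in the summary.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return {base: (count, percent)} for A, C, G and T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {b: (counts[b], 100.0 * counts[b] / len(seq)) for b in "ACGT"}


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the summary line of the report.
REPORTED_COUNTS = {"A": 1776, "C": 1653, "T": 2073, "G": 1577}
total = sum(REPORTED_COUNTS.values())
overall_gc = 100.0 * (REPORTED_COUNTS["G"] + REPORTED_COUNTS["C"]) / total
print(f"{total} bp, overall GC = {overall_gc:.2f}%")
```

The four counts sum to the stated full length of 7079 bp, and the overall GC content works out to about 45.6%, consistent with the note that no region of unusually high GC content is present.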


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       60275253   60278252   3000
YourSeq  76     1923   2015  3000   92.2%     chr13  +       12639644   12639768   125
YourSeq  76     1923   2015  3000   92.2%     chr13  +       13382910   13383034   125
YourSeq  74     94     186   3000   94.2%     chr1   -       25468306   25468431   126
YourSeq  70     1923   2008  3000   91.9%     chr17  -       36140172   36140292   121
YourSeq  69     1916   2008  3000   91.7%     chr11  -       3259946    3260084    139
YourSeq  68     1923   2008  3000   90.7%     chr2   -       180844861  180844981  121
YourSeq  67     1928   2020  3000   91.4%     chr17  -       25316894   25317018   125
YourSeq  66     1928   2004  3000   93.6%     chrX   +       132011105  132011216  112
YourSeq  66     1922   2008  3000   92.5%     chr12  +       110971547  110971670  124
YourSeq  64     1925   2005  3000   93.3%     chrX   +       134227906  134228023  118
YourSeq  64     1917   2001  3000   92.3%     chr11  +       51893988   51894387   400
YourSeq  61     1923   1996  3000   91.8%     chr1   -       173764758  173764866  109
YourSeq  61     1924   2020  3000   91.8%     chr17  +       25682843   25682976   134
YourSeq  60     1923   2005  3000   86.6%     chr1   +       20985689   20985806   118
YourSeq  58     2556   2655  3000   81.7%     chr12  -       12924316   12924414   99
YourSeq  58     1923   1989  3000   94.1%     chr11  -       89956435   89956507   73
YourSeq  58     95     171   3000   86.2%     chr10  -       62179933   62180008   76
YourSeq  58     1917   2005  3000   92.7%     chr15  +       58105138   58105268   131
YourSeq  57     579    673   3000   91.5%     chr19  -       60511349   60511443   95

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr11  -       60271674   60274673   3000
YourSeq  237    1914   2228  3000   92.2%     chr13  +       100619180  100626906  7727
YourSeq  216    1874   2228  3000   87.5%     chr2   +       119181728  119182126  399
YourSeq  195    1901   2223  3000   91.2%     chr17  -       35180909   35181536   628
YourSeq  194    1947   2233  3000   88.7%     chr9   -       110194550  110194893  344
YourSeq  170    1770   2240  3000   93.9%     chr11  -       99251160   99251658   499
YourSeq  168    1786   2092  3000   88.3%     chr4   +       45218193   45218560   368
YourSeq  165    1990   2228  3000   90.0%     chr11  -       85982412   85982703   292
YourSeq  152    2105   2323  3000   95.9%     chr10  -       13522992   13523267   276
YourSeq  150    2103   2323  3000   95.3%     chr7   +       25101598   25101840   243
YourSeq  150    1751   2225  3000   86.6%     chr15  +       55073100   55073474   375
YourSeq  149    1897   2139  3000   87.5%     chr11  -       70697656   70698324   669
YourSeq  144    2102   2323  3000   95.1%     chr2   +       154415281  154415834  554
YourSeq  143    1902   2114  3000   88.3%     chr10  +       52182294   52182961   668
YourSeq  142    1882   2084  3000   87.4%     chr19  -       5709595    5709804    210
YourSeq  142    1882   2071  3000   89.3%     chr13  +       46874045   46874233   189
YourSeq  141    1882   2070  3000   85.0%     chr8   -       27209318   27209497   180
YourSeq  141    1863   2069  3000   83.8%     chr14  -       62324454   62324647   194
YourSeq  140    1882   2071  3000   89.3%     chr16  +       18295520   18295868   349
YourSeq  139    2032   2228  3000   92.7%     chr1   -       133483653  133701182  217530

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
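The top hit in each BLAT table is the query's own location on chr11, so the two self-matches give the genomic coordinates of the 3000 bp flanks. The sketch below derives the interval lying between them, which should correspond to the cKO region. It assumes the coordinates are 1-based and inclusive, which agrees with the SPAN column (60278252 - 60275253 + 1 = 3000). The `Interval` type and helper names are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed from the SPAN column)
    end: int    # 1-based, inclusive


def length(iv: Interval) -> int:
    """Number of bases covered by an inclusive interval."""
    return iv.end - iv.start + 1


def gap_between(a: Interval, b: Interval) -> Interval:
    """Return the interval strictly between two non-overlapping intervals."""
    if a.chrom != b.chrom:
        raise ValueError("intervals are on different chromosomes")
    left, right = sorted([a, b], key=lambda iv: iv.start)
    if right.start <= left.end + 1:
        raise ValueError("intervals overlap or are adjacent; no gap")
    return Interval(a.chrom, left.end + 1, right.start - 1)


# Self-matches (first row) from the upstream and downstream BLAT tables.
# Tom1l2 lies on the reverse strand, so the upstream flank has the higher coordinates.
UPSTREAM = Interval("chr11", 60275253, 60278252)
DOWNSTREAM = Interval("chr11", 60271674, 60274673)

cko = gap_between(UPSTREAM, DOWNSTREAM)
print(cko, length(cko))
```

Under that coordinate assumption the gap spans chr11:60274674-60275252, a length of 579 bp, which matches the effective cKO region size of ~579 bp quoted in the strategy summary.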


Gene information: Tom1l2, target of myb1-like 2 (chicken) [Mus musculus (house mouse)]. Gene ID: 216810, updated on 24-Oct-2019

Gene summary

Official Symbol: Tom1l2 (provided by MGI)
Official Full Name: target of myb1-like 2 (chicken) (provided by MGI)
Primary source: MGI:MGI:2443306
See related: Ensembl:ENSMUSG00000000538
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Srebf1; AU042072; 2900016I08Rik; A730055F12Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 35.6), cortex adult (RPKM 28.0) and 26 other tissues
Orthologs: human

Genomic context

Location: 11; 11 B2

Exon count: 21

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 11 | NC_000077.6 (60226714..60352932, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 11 | NC_000077.5 (60040216..60166407, complement)

[Figure: genomic context of Tom1l2 on chromosome 11 (NC_000077.6).]


Transcript information: This gene has 11 transcripts

Gene: Tom1l2 ENSMUSG00000000538

Description: target of myb1-like 2 (chicken) [Source:MGI Symbol;Acc:MGI:2443306]
Gene Synonyms: 2900016I08Rik, A730055F12Rik, myb1-like protein 2
Location: Chromosome 11: 60,226,714-60,352,905 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 11 transcripts (splice variants), 274 orthologues, 10 paralogues, is a member of 1 Ensembl protein family and is associated with 40 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Tom1l2-206 | ENSMUST00000102683.10 | 4999 | 507aa | ENSMUSP00000099744.4 | Protein coding | CCDS24786 | Q5SRX1 | TSL:1, GENCODE basic, APPRIS P3
Tom1l2-204 | ENSMUST00000095254.11 | 4940 | 487aa | ENSMUSP00000092884.5 | Protein coding | CCDS36171 | Q5SRX1 | TSL:1, GENCODE basic, APPRIS ALT1
Tom1l2-205 | ENSMUST00000102682.4 | 2256 | 440aa | ENSMUSP00000099743.4 | Protein coding | CCDS24787 | Q5SRX1 | TSL:1, GENCODE basic
Tom1l2-203 | ENSMUST00000093048.12 | 4864 | 462aa | ENSMUSP00000090736.6 | Protein coding | - | Q5SXA4 | TSL:5, GENCODE basic
Tom1l2-202 | ENSMUST00000093046.12 | 4849 | 457aa | ENSMUSP00000090734.6 | Protein coding | - | Q5SXA5 | TSL:5, GENCODE basic
Tom1l2-201 | ENSMUST00000064019.14 | 2151 | 450aa | ENSMUSP00000063414.8 | Protein coding | - | Q5SRX1 | TSL:1, GENCODE basic
Tom1l2-207 | ENSMUST00000133420.7 | 790 | 199aa | ENSMUSP00000117623.1 | Protein coding | - | F6ZDJ1 | CDS 5' incomplete, TSL:3
Tom1l2-209 | ENSMUST00000143124.7 | 729 | 179aa | ENSMUSP00000121936.1 | Protein coding | - | F6RBX1 | CDS 5' incomplete, TSL:3
Tom1l2-211 | ENSMUST00000153920.1 | 3791 | No protein | - | lncRNA | - | - | TSL:1
Tom1l2-208 | ENSMUST00000142225.1 | 3013 | No protein | - | lncRNA | - | - | TSL:1
Tom1l2-210 | ENSMUST00000151284.1 | 886 | No protein | - | lncRNA | - | - | TSL:2


[Figure: Ensembl genome browser view of a 146.19 kb region of chromosome 11 (about 60.22 Mb to 60.36 Mb).
Forward strand: Gm12265-201 (lncRNA); Gm12266-201 (processed pseudogene); Drc3-201, Drc3-202, Drc3-203 and Drc3-204 (protein coding); Drc3-205 (lncRNA).
Contigs: AL669954.6, AL596090.11.
Reverse strand: Srebf1-201 (protein coding); Srebf1-204 (lncRNA); Tom1l2-201, -202, -203, -204, -205, -206, -207 and -209 (protein coding); Tom1l2-208, -210 and -211 (lncRNA); Gm27711-201 (miRNA).
Tracks: Genes (Comprehensive set), Regulatory Build.
Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; RNA gene).]


Transcript: ENSMUST00000102683

[Figure: protein domain and variation view for Tom1l2-206 (protein coding; reverse strand, 126.19 kb), protein ENSMUSP00000099..., scale 0 to 507 aa.
Tracks shown:
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: SSF89009; ENTH/VHS
SMART: VHS domain
Pfam: VHS domain; GAT domain
PROSITE profiles: VHS domain; GAT domain
PIRSF: Target of Myb protein 1
PANTHER: PTHR13856; Target of Myb1-like 2
Gene3D: ENTH/VHS; GAT domain superfamily
CDD: cd16996; cd14238
Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
