https://www.alphaknockout.com

Mouse Tmx1 Knockout Project (CRISPR/Cas9)

Objective: To create a Tmx1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tmx1 gene (NCBI Reference Sequence: NM_028339; Ensembl: ENSMUSG00000021072) is located on mouse chromosome 12. Eight exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 8 (Transcript: ENSMUST00000021471). Exons 2 to 7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: No notable phenotype was detected in a high-throughput screen of homozygous mice.

Exon 2 starts at about 18.35% of the coding region. Exons 2 to 7 cover 60.67% of the coding region. The effective KO region is ~8455 bp. The KO region does not contain any other known gene.
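The percentages above can be turned into approximate base counts. The sketch below does this under one loud assumption that the report does not state: the coding region is taken to be 278 codons (834 nt, stop codon excluded), based on the 278 aa protein listed for ENSMUST00000021471. The names CDS_LENGTH_NT and percent_to_bases are hypothetical and used only for illustration.

```python
# Assumption (not stated in the report): the coding region is 278 codons,
# i.e. 834 nt without the stop codon, taken from the 278 aa protein length.
CDS_LENGTH_NT = 278 * 3


def percent_to_bases(percent, total=CDS_LENGTH_NT):
    """Convert a percentage of the coding region to a rounded base count."""
    return round(percent / 100 * total)


exon2_offset = percent_to_bases(18.35)  # coding bases upstream of exon 2
deleted_cds = percent_to_bases(60.67)   # coding bases within exons 2 to 7

# A deleted coding length that is not a multiple of 3 shifts the reading frame.
causes_frameshift = deleted_cds % 3 != 0

print(exon2_offset, deleted_cds, causes_frameshift)
```

If the assumed denominator is right, a deletion of this size would not be a multiple of three, so any transcript spliced from exon 1 to exon 8 would be out of frame. This should be confirmed against the annotated exon coordinates rather than taken from the rounded percentages.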


Overview of the Targeting Strategy

[Figure: wildtype Tmx1 allele drawn 5' to 3' with exons 1 to 8; the two gRNA regions flank the knockout region. Legend: exon of mouse Tmx1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
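The self-alignment described in the note can be sketched in a few lines. This is a minimal illustration, not the tool used for the report: it assumes exact matching of 15 bp words on the forward strand only, whereas the report's plot also shows reverse-complement matches. The function name self_dot_plot and the toy sequence are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return (i, j) pairs, i < j, where the window-length words starting
    at positions i and j of seq are identical (forward strand only)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequence (hypothetical, not from the Tmx1 locus):
# a 20 bp unit repeated three times in tandem.
unit = "ACGTTGCAAGCTTAGGCTAC"
toy = "TTTT" + unit * 3 + "GGGG"

dots = self_dot_plot(toy, window=15)
offsets = sorted({j - i for i, j in dots})
print(offsets)
```

Off-diagonal dots at a constant offset trace lines parallel to the main diagonal, which is how tandem repeats show up in the matrix. A gRNA site would then be chosen outside the positions covered by such dots.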

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The 1511 bp section downstream of Exon 7 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(2000bp) | A(27.35% 547) | C(18.4% 368) | T(32.85% 657) | G(21.4% 428)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(1511bp) | A(26.87% 406) | C(17.34% 262) | T(34.08% 515) | G(21.71% 328)

Note: The 1511 bp section downstream of Exon 7 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
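The overall GC content of each flank is not printed in the summaries, but it follows directly from the base counts. The sketch below recomputes the totals and percentages from the counts given in the two summary lines; the function name composition is hypothetical.

```python
def composition(counts):
    """Return (total length, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    percentages = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, percentages, gc_percent


# Base counts copied from the two summary lines of the report.
upstream = {"A": 547, "C": 368, "T": 657, "G": 428}
downstream = {"A": 406, "C": 262, "T": 515, "G": 328}

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    total, percentages, gc_percent = composition(counts)
    print(name, total, percentages, gc_percent)
```

Both flanks come out at roughly 39 to 40% GC, which is consistent with the statement that neither contains a region of significantly high GC content.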


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  +       70454013   70456012   2000
YourSeq  39     856    906   2000   81.3%     chrX   -       21179805   21179852   48
YourSeq  34     850    900   2000   78.3%     chr1   -       24100932   24100979   48
YourSeq  33     1406   1701  2000   92.4%     chr1   -       16154834   16155292   459
YourSeq  30     878    910   2000   96.9%     chr1   +       45555005   45555057   53
YourSeq  29     878    907   2000   100.0%    chr11  +       10795329   10795362   34
YourSeq  29     877    906   2000   100.0%    chr1   +       22347230   22347267   38
YourSeq  28     882    918   2000   94.0%     chr19  -       24992936   24992984   49
YourSeq  28     878    907   2000   89.7%     chrX   +       86120186   86120214   29
YourSeq  28     878    906   2000   100.0%    chr8   +       58738246   58738277   32
YourSeq  27     878    907   2000   96.6%     chr5   -       110326133  110326166  34
YourSeq  27     1843   1879  2000   96.6%     chr7   +       34908224   34908261   38
YourSeq  27     878    906   2000   96.6%     chr2   +       106339906  106339934  29
YourSeq  27     878    904   2000   100.0%    chr13  +       110152779  110152805  27
YourSeq  27     878    906   2000   96.6%     chr11  +       23906931   23906959   29
YourSeq  27     880    907   2000   100.0%    chr10  +       100483746  100483775  30
YourSeq  27     880    910   2000   86.7%     chr1   +       40043709   40043738   30
YourSeq  26     1843   1879  2000   96.5%     chr9   -       48680953   48680991   39
YourSeq  26     880    908   2000   96.5%     chr1   -       5386148    5386176    29
YourSeq  26     1597   1630  2000   83.9%     chr12  +       5671199    5671231    33

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
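One way to read the table is to separate the self-hit (the query matching its own locus) from all other hits and look at the best remaining score. The sketch below parses the first four rows of the upstream result; the 100-point cutoff for a "significant" secondary hit and the function names are assumptions, since the report does not state its criterion.

```python
# First four rows of the upstream BLAT result, QUERY column omitted.
ROWS = """\
2000 1 2000 2000 100.0% chr12 + 70454013 70456012 2000
39 856 906 2000 81.3% chrX - 21179805 21179852 48
34 850 900 2000 78.3% chr1 - 24100932 24100979 48
33 1406 1701 2000 92.4% chr1 - 16154834 16155292 459
"""


def parse_hits(text):
    """Parse whitespace-separated BLAT rows into dictionaries."""
    hits = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 10:
            continue  # skip blank or malformed lines
        hits.append({
            "score": int(parts[0]),
            "q_start": int(parts[1]),
            "q_end": int(parts[2]),
            "q_size": int(parts[3]),
            "identity": float(parts[4].rstrip("%")),
            "chrom": parts[5],
            "strand": parts[6],
            "t_start": int(parts[7]),
            "t_end": int(parts[8]),
            "span": int(parts[9]),
        })
    return hits


def secondary_hits(hits, min_score=100):
    """Return non-self hits at or above min_score.
    The self-hit is taken to be the one whose score equals the query size;
    min_score=100 is a hypothetical cutoff, not taken from the report."""
    return [h for h in hits
            if h["score"] != h["q_size"] and h["score"] >= min_score]


hits = parse_hits(ROWS)
best_secondary = max(h["score"] for h in hits if h["score"] != h["q_size"])
print(best_secondary, secondary_hits(hits))
```

With the best secondary score far below the self-hit score of 2000, the filter returns nothing, in line with the note's conclusion.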

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  1511   1      1511  1511   100.0%    chr12  +       70464468   70465978   1511
YourSeq  29     53     95    1511   81.3%     chr11  -       74478685   74478724   40
YourSeq  29     49     88    1511   79.0%     chr4   +       124239274  124239311  38
YourSeq  29     55     86    1511   96.9%     chr16  +       10499241   10499274   34
YourSeq  29     56     87    1511   96.9%     chr10  +       13566405   13566438   34
YourSeq  29     51     87    1511   94.0%     chr10  +       3175213    3175251    39
YourSeq  28     51     88    1511   96.7%     chr11  -       7394498    7394537    40
YourSeq  27     53     84    1511   93.6%     chr3   -       69343664   69343697   34
YourSeq  27     57     87    1511   96.6%     chr19  -       23336049   23336081   33
YourSeq  27     55     87    1511   96.6%     chr17  -       24451968   24452002   35
YourSeq  27     54     85    1511   96.6%     chr11  -       71960825   71960858   34
YourSeq  27     54     83    1511   96.6%     chrX   +       42897161   42897192   32
YourSeq  26     1173   1201  1511   85.2%     chrX   -       9206764    9206790    27
YourSeq  26     52     82    1511   96.5%     chr15  -       12110834   12110866   33
YourSeq  26     56     82    1511   100.0%    chr7   +       140875919  140875947  29
YourSeq  26     53     87    1511   96.5%     chr1   +       150197455  150197491  37
YourSeq  25     59     88    1511   93.2%     chr13  +       44015315   44015346   32
YourSeq  24     54     83    1511   96.2%     chr14  +       57412071   57412102   32
YourSeq  24     55     85    1511   96.2%     chr13  +       43248059   43248091   33
YourSeq  22     1181   1203  1511   100.0%    chr15  -       35286854   35286877   24

Note: The 1511 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
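The self-hits of the two BLAT searches give the chr12 coordinates of both flanking sections, and the gap between them can be compared with the stated KO region size. The sketch below assumes 1-based inclusive coordinates and assumes that the two flanks directly abut the KO region; neither is stated in the report, and the variable names are hypothetical.

```python
# chr12 coordinates of the 100% self-hits in the two BLAT tables.
UPSTREAM_FLANK = (70454013, 70456012)
DOWNSTREAM_FLANK = (70464468, 70465978)


def span(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


# Assumption: the KO region is exactly the sequence between the two flanks.
ko_start = UPSTREAM_FLANK[1] + 1
ko_end = DOWNSTREAM_FLANK[0] - 1
ko_size = span((ko_start, ko_end))

print(span(UPSTREAM_FLANK), span(DOWNSTREAM_FLANK), ko_start, ko_end, ko_size)
```

Under these assumptions the gap between the flanks matches the ~8455 bp effective KO region quoted in the strategy summary, and the whole interval lies within the Tmx1 locus on chromosome 12.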


Gene and information: Tmx1 thioredoxin-related transmembrane protein 1 [ Mus musculus (house mouse) ] Gene ID: 72736, updated on 12-Aug-2019

Gene summary

Official Symbol: Tmx1 (provided by MGI)
Official Full Name: thioredoxin-related transmembrane protein 1 (provided by MGI)
Primary source: MGI:MGI:1919986
See related: Ensembl:ENSMUSG00000021072
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Txndc1; 2810425A04Rik
Expression: Broad expression in liver E14 (RPKM 28.3), placenta adult (RPKM 26.2) and 25 other tissues
Orthologs: human

Genomic context

Location: 12; 12 C2
Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (70453154..70467624)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (71554141..71568611)

[Figure: genomic context of Tmx1 on Chromosome 12 - NC_000078.6.]


Transcript information: This gene has 3 transcripts

Gene: Tmx1 ENSMUSG00000021072

Description: thioredoxin-related transmembrane protein 1 [Source:MGI Symbol;Acc:MGI:1919986]
Gene Synonyms: 2810425A04Rik, Txndc1
Location: Chromosome 12: 70,453,095-70,468,040 forward strand. GRCm38:CM001005.2
About this gene: This gene has 3 transcripts (splice variants), 214 orthologues, 13 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Tmx1-201  ENSMUST00000021471.12  2795  278aa       ENSMUSP00000021471.6  Protein coding           CCDS25958  Q8VBT0   TSL:1, GENCODE basic, APPRIS P1
Tmx1-203  ENSMUST00000162277.1   710   124aa       ENSMUSP00000123893.1  Nonsense mediated decay  -          F6V084   CDS 5' incomplete, TSL:3
Tmx1-202  ENSMUST00000160865.1   648   No protein  -                     Retained intron          -          -        TSL:2

[Figure: Ensembl genome browser view of the Tmx1 locus (34.95 kb, 70.45 Mb to 70.47 Mb, forward and reverse strands), showing transcripts Tmx1-201 (protein coding), Tmx1-203 (nonsense mediated decay) and Tmx1-202 (retained intron), contig AC122354.4 and the Regulatory Build track. Regulation legend: CTCF; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000021471

[Figure: Ensembl protein summary for Tmx1-201 (protein coding; 14.95 kb, forward strand), scale bar 0 to 278 aa, with tracks for transmembrane helices, MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), cleavage site and sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant.]

Domain annotations shown in the figure:
Superfamily: Thioredoxin-like superfamily
Pfam: Thioredoxin domain
PROSITE profiles: Thioredoxin domain
PROSITE patterns: Thioredoxin, conserved site
PANTHER: PTHR46107; PTHR46107:SF2
Gene3D: 3.40.30.10
CDD: cd02994

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
