https://www.alphaknockout.com
Mouse Tm4sf1 Knockout Project (CRISPR/Cas9)
Objective: To create a Tm4sf1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.
Strategy summary: The Tm4sf1 gene (NCBI Reference Sequence: NM_008536; Ensembl: ENSMUSG00000027800) is located on mouse chromosome 3. Seven exons have been identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000196979). Exons 3 to 7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs to produce KO mice. The pups will be genotyped by PCR followed by sequencing analysis.
Note: Exon 3 starts at about 0.17% of the coding region. Exons 3 to 7 cover 100.0% of the coding region. The size of the effective KO region is ~7028 bp. The KO region does not contain any other known gene.
Overview of the Targeting Strategy
[Figure: wild-type allele drawn 5' to 3', showing exons 1 and 3 to 7 of mouse Tm4sf1 and two gRNA regions. Legend: exon of mouse Tm4sf1; knockout region.]
Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement.]
Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
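The tandem-repeat check described in the dot plot notes can be sketched as a self comparison: every 15 bp word in a sequence is compared with every other word in the same sequence, and any match off the main diagonal points to a repeat. This is only a minimal illustration, not the tool used to produce the report. The function names and the 20 bp toy sequence are hypothetical, and only the forward strand is compared, whereas the plots in the report also show reverse-complement matches.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the words of length
    `window` starting at i and j are identical (off-diagonal dots)."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    for starts in positions.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                hits.append((i, j))
    return sorted(hits)


def repeat_offsets(seq, window=15):
    """Distances between matching words; an empty list means no repeat
    of at least `window` bp was found on the forward strand."""
    return sorted({j - i for i, j in self_dot_plot(seq, window)})


# Hypothetical toy data: a 20 bp unit and the same unit repeated in tandem.
unit = "ACGTTGCAAGCTTAGGCATC"
tandem = unit * 2

print(repeat_offsets(unit))    # no repeated 15 bp word in the single unit
print(repeat_offsets(tandem))  # includes the 20 bp repeat period
```

For a 2000 bp flanking region, an empty result from `repeat_offsets` would correspond to a dot plot with no diagonals other than the main one.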
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content distribution plot for the upstream sequence.]
Summary: Full Length(2000bp) | A(26.4% 528) | C(22.75% 455) | T(27.25% 545) | G(23.6% 472)
Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content distribution plot for the downstream sequence.]
Summary: Full Length(2000bp) | A(29.5% 590) | C(20.2% 404) | T(29.8% 596) | G(20.5% 410)
Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
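The overall GC content of each flanking region follows directly from the base counts in the two Summary lines, and the windowed distribution can be approximated with a sliding window. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the step size is a free parameter that the report does not specify, and the short sequence at the end is toy data, since the actual 2000 bp sequences are not reproduced in this document.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from per-base counts."""
    total = a + c + g + t
    return 100.0 * (g + c) / total


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window of length `window`, moved by `step`."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


# Base counts copied from the Summary lines of the report.
upstream = gc_percent_from_counts(a=528, c=455, g=472, t=545)
downstream = gc_percent_from_counts(a=590, c=404, g=410, t=596)
print(round(upstream, 2))    # 46.35
print(round(downstream, 2))  # 40.7

# Toy sequence only: two non-overlapping 4 bp windows.
print(sliding_gc("GGCCAATT", window=4, step=4))  # [100.0, 0.0]
```

Both flanking regions therefore sit below 50% GC overall, which is consistent with the notes that no significant high GC-content region is found.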
BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr3   -       57294789   57296788   2000
YourSeq  27     913    951   2000   96.6%     chr1   +       176019877  176019967  91
YourSeq  21     1962   1982  2000   100.0%    chr4   -       6976486    6976506    21
YourSeq  21     740    761   2000   100.0%    chr1   +       193867219  193867241  23
Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found: apart from the self-hit on chr3, the best hit scores only 27.
BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr3   -       57285759   57287758   2000
YourSeq  70     1924   1998  2000   97.4%     chr11  +       51800401   51910802   110402
YourSeq  69     1909   1999  2000   89.1%     chr13  +       29845633   30202041   356409
YourSeq  60     1919   1998  2000   87.5%     chr5   -       126708500  126708579  80
YourSeq  60     1917   1996  2000   86.9%     chr12  +       99552821   99552899   79
YourSeq  60     1921   1992  2000   95.6%     chr12  +       86908029   86908100   72
YourSeq  59     1924   1998  2000   92.9%     chr12  -       79972698   79972772   75
YourSeq  59     1921   1997  2000   88.4%     chr2   +       71711490   71711566   77
YourSeq  59     1922   1992  2000   91.6%     chr16  +       37348860   37348930   71
YourSeq  58     1921   1998  2000   87.2%     chr10  -       53980922   53980999   78
YourSeq  57     1924   1998  2000   88.0%     chr12  -       8768636    8768710    75
YourSeq  57     1924   1998  2000   88.0%     chr11  +       44436380   44436454   75
YourSeq  56     1928   1993  2000   96.8%     chr6   -       94300002   94300067   66
YourSeq  56     1927   1998  2000   95.2%     chr1   -       153350591  153350663  73
YourSeq  56     1927   1992  2000   96.8%     chr16  +       32030100   32030165   66
YourSeq  56     1924   1997  2000   87.9%     chr13  +       114829268  114829341  74
YourSeq  56     1927   1992  2000   96.8%     chr13  +       98928659   98928724   66
YourSeq  56     1921   1979  2000   98.4%     chr10  +       36426568   36426627   60
YourSeq  56     1927   1998  2000   88.9%     chr1   +       88765725   88765796   72
YourSeq  55     1926   1998  2000   87.7%     chr10  -       63832680   63832752   73
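Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found: apart from the self-hit on chr3, all hits score 70 or less and cover at most 91 bp near the 3' end of the 2000 bp query (positions 1909 to 1999).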
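The reasoning behind "no significant similarity" can be made explicit by separating the self-hit (the query matching its own locus) from all other hits and looking at the best remaining score. The sketch below parses rows in the layout of the BLAT tables above and uses the four upstream rows as input. It is an illustration only: the `BlatHit` type and the function names are hypothetical, and the report does not state what score threshold it treats as significant.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one result row; only the last 10 whitespace-separated
    fields are used, so leading link or query-name columns are ignored."""
    f = line.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def split_self_hit(hits):
    """Return (self_hit, other_hits), taking the top-scoring hit as the self-hit."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return ranked[0], ranked[1:]


# Rows copied from the upstream BLAT table of the report.
UPSTREAM_ROWS = """
YourSeq 2000 1 2000 2000 100.0% chr3 - 57294789 57296788 2000
YourSeq 27 913 951 2000 96.6% chr1 + 176019877 176019967 91
YourSeq 21 1962 1982 2000 100.0% chr4 - 6976486 6976506 21
YourSeq 21 740 761 2000 100.0% chr1 + 193867219 193867241 23
"""

hits = [parse_hit(row) for row in UPSTREAM_ROWS.strip().splitlines()]
self_hit, others = split_self_hit(hits)
best_other = max(others, key=lambda h: h.score)
print(self_hit.chrom, self_hit.score)      # chr3 2000
print(best_other.chrom, best_other.score)  # chr1 27
```

Taking the top-scoring hit as the self-hit is a simplification that works here because the query matches its own locus over its full length at 100.0% identity.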
Gene and protein information:
Tm4sf1 transmembrane 4 superfamily member 1 [Mus musculus (house mouse)]
Gene ID: 17112, updated on 24-Oct-2019
Gene summary
Official Symbol: Tm4sf1 (provided by MGI)
Official Full Name: transmembrane 4 superfamily member 1 (provided by MGI)
Primary source: MGI:MGI:104678
See related: Ensembl:ENSMUSG00000027800
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: L6; M3s1
Expression: Broad expression in lung adult (RPKM 43.9), heart adult (RPKM 24.6) and 19 other tissues
Orthologs: human, all
Genomic context
Location: 3; 3 D
Exon count: 10
Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 3 | NC_000069.6 (57285611..57387736, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 3 | NC_000069.5 (57090986..57105841, complement)
[Figure: genomic context on Chromosome 3 - NC_000069.6]
Transcript information: This gene has 6 transcripts
Gene: Tm4sf1 ENSMUSG00000027800
Description: transmembrane 4 superfamily member 1 [Source:MGI Symbol;Acc:MGI:104678]
Gene Synonyms: 12A8 target antigen, L6, L6 antigen, M3s1
Location: Chromosome 3: 57,285,611-57,301,988 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 6 transcripts (splice variants), 122 orthologues, 4 paralogues and is a member of 1 Ensembl protein family.
Transcripts
Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Tm4sf1-205 | ENSMUST00000196979.4 | 3372 | 202aa | ENSMUSP00000143652.1 | Protein coding | CCDS38433 | Q64302 | TSL:1 GENCODE basic APPRIS P1
Tm4sf1-202 | ENSMUST00000171384.7 | 2855 | 202aa | ENSMUSP00000130999.1 | Protein coding | CCDS38433 | Q64302 | TSL:1 GENCODE basic APPRIS P1
Tm4sf1-201 | ENSMUST00000029376.12 | 1436 | 202aa | ENSMUSP00000029376.8 | Protein coding | CCDS38433 | Q64302 | TSL:1 GENCODE basic APPRIS P1
Tm4sf1-203 | ENSMUST00000196506.1 | 747 | 136aa | ENSMUSP00000143697.1 | Protein coding | - | A0A0G2JGU1 | CDS 3' incomplete TSL:2
Tm4sf1-204 | ENSMUST00000196704.1 | 798 | No protein | - | Retained intron | - | - | TSL:3
Tm4sf1-206 | ENSMUST00000198030.1 | 767 | No protein | - | lncRNA | - | - | TSL:3
[Figure: Ensembl genome browser view of a 36.38 kb region (57.28 Mb to 57.31 Mb), forward and reverse strands. Contigs: AC125103.4 >, < AC119854.7. Genes (Comprehensive set...): < Tm4sf1-202, < Tm4sf1-205, < Tm4sf1-201 and < Tm4sf1-203 (protein coding), < Tm4sf1-206 (lncRNA), < Tm4sf1-204 (retained intron). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]
Transcript: ENSMUST00000196979
[Figure: Ensembl protein view of Tm4sf1-205 (protein coding, reverse strand, 16.33 kb) for ENSMUSP00000143... Tracks: transmembrane helices, low complexity (Seg), Pfam (L6 membrane), PANTHER (PTHR14198:SF18, L6 membrane), and sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 202.]
We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.