Mouse Tm4sf1 Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective: To create a Tm4sf1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Tm4sf1 gene (NCBI Reference Sequence: NM_008536; Ensembl: ENSMUSG00000027800) is located on mouse chromosome 3. Seven exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000196979). Exons 3-7 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts from about 0.17% of the coding region. Exons 3-7 cover 100.0% of the coding region. The size of the effective KO region is ~7028 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3', with exon labels 1, 3, 4, 5, 6, 7 and two gRNA regions. Legends: exon of mouse Tm4sf1; knockout region.]

Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]
Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
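The self-alignment behind the dot plots can be sketched roughly as follows. This is a minimal illustration, not the tool used to produce the figures: it assumes exact matches of 15 bp windows, and the names `revcomp` and `self_dot_plot` are hypothetical. Off-diagonal forward matches at a regular spacing would point to a tandem repeat.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Find exact window-length matches of a sequence against itself.

    Returns (forward, reverse):
      forward: (i, j) with i < j where seq[i:i+window] == seq[j:j+window]
      reverse: (i, j) where seq[i:i+window] equals the reverse complement
               of seq[j:j+window]
    The trivial main diagonal (i == j) is left out of `forward`.
    """
    seq = seq.upper()
    # Index every window by its sequence so repeats share one key.
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Positions are stored in increasing order, so j > i here.
        for n, i in enumerate(positions):
            for j in positions[n + 1:]:
                forward.append((i, j))
        # Windows elsewhere that are the reverse complement of this one.
        for j in index.get(revcomp(kmer), []):
            for i in positions:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy input: a 7 bp unit repeated twice, scanned with a 7 bp window.
forward, reverse = self_dot_plot("GATTACAGATTACA", window=7)
print(forward, reverse)
```

For a 2000 bp flank, an empty or near-empty `forward` list would correspond to the "no significant tandem repeat" conclusion in the notes above.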
Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full Length(2000bp) | A(26.4% 528) | C(22.75% 455) | T(27.25% 545) | G(23.6% 472)
Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content distribution along Sequence 12.]
Summary: Full Length(2000bp) | A(29.5% 590) | C(20.2% 404) | T(29.8% 596) | G(20.5% 410)
Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr3   -        57294789   57296788    2000
YourSeq     27    913   951   2000     96.6%  chr1   +       176019877  176019967      91
YourSeq     21   1962  1982   2000    100.0%  chr4   -         6976486    6976506      21
YourSeq     21    740   761   2000    100.0%  chr1   +       193867219  193867241      23
Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.
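The base counts in the two GC-content Summary lines can be turned into overall GC percentages with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and `windowed_gc` assumes a plain sliding window of 300 bp, which may differ from how the plotted distribution was actually computed.

```python
def gc_percent(counts):
    """GC percentage from a dict of base counts with keys A, C, G, T."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq, window=300, step=1):
    """GC percentage of each `window`-bp slice, moving by `step` bases."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        out.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return out


# Base counts copied from the two Summary lines (2000 bp each).
UPSTREAM = {"A": 528, "C": 455, "T": 545, "G": 472}
DOWNSTREAM = {"A": 590, "C": 404, "T": 596, "G": 410}

print(f"upstream GC:   {gc_percent(UPSTREAM):.2f}%")    # 46.35%
print(f"downstream GC: {gc_percent(DOWNSTREAM):.2f}%")  # 40.70%
```

Both flanks come out below 50% GC overall, which agrees with the notes that no high GC-content region stands out.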
BLAT Search Results (down)
QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr3   -        57285759   57287758    2000
YourSeq     70   1924  1998   2000     97.4%  chr11  +        51800401   51910802  110402
YourSeq     69   1909  1999   2000     89.1%  chr13  +        29845633   30202041  356409
YourSeq     60   1919  1998   2000     87.5%  chr5   -       126708500  126708579      80
YourSeq     60   1917  1996   2000     86.9%  chr12  +        99552821   99552899      79
YourSeq     60   1921  1992   2000     95.6%  chr12  +        86908029   86908100      72
YourSeq     59   1924  1998   2000     92.9%  chr12  -        79972698   79972772      75
YourSeq     59   1921  1997   2000     88.4%  chr2   +        71711490   71711566      77
YourSeq     59   1922  1992   2000     91.6%  chr16  +        37348860   37348930      71
YourSeq     58   1921  1998   2000     87.2%  chr10  -        53980922   53980999      78
YourSeq     57   1924  1998   2000     88.0%  chr12  -         8768636    8768710      75
YourSeq     57   1924  1998   2000     88.0%  chr11  +        44436380   44436454      75
YourSeq     56   1928  1993   2000     96.8%  chr6   -        94300002   94300067      66
YourSeq     56   1927  1998   2000     95.2%  chr1   -       153350591  153350663      73
YourSeq     56   1927  1992   2000     96.8%  chr16  +        32030100   32030165      66
YourSeq     56   1924  1997   2000     87.9%  chr13  +       114829268  114829341      74
YourSeq     56   1927  1992   2000     96.8%  chr13  +        98928659   98928724      66
YourSeq     56   1921  1979   2000     98.4%  chr10  +        36426568   36426627      60
YourSeq     56   1927  1998   2000     88.9%  chr1   +        88765725   88765796      72
YourSeq     55   1926  1998   2000     87.7%  chr10  -        63832680   63832752      73
Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.
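One way to read the BLAT tables programmatically is sketched below. It is an assumption-laden illustration rather than part of the original analysis: `BlatHit`, `parse_hit`, `secondary_hits` and `query_coverage` are hypothetical names, the self match is identified simply as the hit whose score equals the query size, and coordinates are treated as 1-based inclusive because each full-length hit reports a span of 2000. The last lines compute the interval lying between the two flanks on chr3; the source does not say how its ~7028 bp KO figure was measured, so this is only a rough cross-check.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated BLAT result row."""
    # Tolerate the "browser details" link text that web output carries.
    fields = [f for f in line.split() if f not in ("browser", "details")]
    score, q_start, q_end, q_size = (int(x) for x in fields[1:5])
    identity = float(fields[5].rstrip("%"))
    chrom, strand = fields[6], fields[7]
    t_start, t_end, span = (int(x) for x in fields[8:11])
    return BlatHit(score, q_start, q_end, q_size, identity,
                   chrom, strand, t_start, t_end, span)


def secondary_hits(hits):
    """Hits other than the full-length self match, best score first."""
    others = [h for h in hits if h.score < h.q_size]
    return sorted(others, key=lambda h: h.score, reverse=True)


def query_coverage(hit):
    """Fraction of the query covered by the hit (1-based inclusive)."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


# Rows copied from the upstream table, plus the downstream self match.
UP_ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr3 - 57294789 57296788 2000",
    "YourSeq 27 913 951 2000 96.6% chr1 + 176019877 176019967 91",
    "YourSeq 21 1962 1982 2000 100.0% chr4 - 6976486 6976506 21",
    "YourSeq 21 740 761 2000 100.0% chr1 + 193867219 193867241 23",
]
DOWN_SELF = "YourSeq 2000 1 2000 2000 100.0% chr3 - 57285759 57287758 2000"

up_hits = [parse_hit(row) for row in UP_ROWS]
best = secondary_hits(up_hits)[0]
print(best.chrom, best.score, f"{query_coverage(best):.2%}")  # chr1 27 1.95%

# The gene is on the minus strand, so the downstream flank has the lower
# coordinates. Count the bases strictly between the two flanks.
up_self, down_self = up_hits[0], parse_hit(DOWN_SELF)
between = up_self.t_start - down_self.t_end - 1
print(between)  # 7030
```

The best upstream secondary hit covers under 2% of the query, and in the downstream table every secondary hit falls within query positions 1909-1999, so the similarity is confined to the last ~90 bp of that flank. The 7030 bp between the flanks is close to the ~7028 bp stated in the strategy note.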
Gene and protein information:
Tm4sf1 transmembrane 4 superfamily member 1 [ Mus musculus (house mouse) ]
Gene ID: 17112, updated on 24-Oct-2019

Gene summary
Official Symbol: Tm4sf1 (provided by MGI)
Official Full Name: transmembrane 4 superfamily member 1 (provided by MGI)
Primary source: MGI:MGI:104678
See related: Ensembl:ENSMUSG00000027800
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: L6; M3s1
Expression: Broad expression in lung adult (RPKM 43.9), heart adult (RPKM 24.6) and 19 other tissues
Orthologs: human, all

Genomic context
Location: 3; 3 D
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (57285611..57387736, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (57090986..57105841, complement)

[Figure: genomic context, Chromosome 3 - NC_000069.6.]

Transcript information: This gene has 6 transcripts

Gene: Tm4sf1 ENSMUSG00000027800
Description: transmembrane 4 superfamily member 1 [Source:MGI Symbol;Acc:MGI:104678]
Gene Synonyms: 12A8 target antigen, L6, L6 antigen, M3s1
Location: Chromosome 3: 57,285,611-57,301,988 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 6 transcripts (splice variants), 122 orthologues, 4 paralogues and is a member of 1 Ensembl protein family.
Transcripts
Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Tm4sf1-205  ENSMUST00000196979.4   3372  202aa       ENSMUSP00000143652.1  Protein coding   CCDS38433  Q64302      TSL:1, GENCODE basic, APPRIS P1
Tm4sf1-202  ENSMUST00000171384.7   2855  202aa       ENSMUSP00000130999.1  Protein coding   CCDS38433  Q64302      TSL:1, GENCODE basic, APPRIS P1
Tm4sf1-201  ENSMUST00000029376.12  1436  202aa       ENSMUSP00000029376.8  Protein coding   CCDS38433  Q64302      TSL:1, GENCODE basic, APPRIS P1
Tm4sf1-203  ENSMUST00000196506.1   747   136aa       ENSMUSP00000143697.1  Protein coding   -          A0A0G2JGU1  CDS 3' incomplete, TSL:2
Tm4sf1-204  ENSMUST00000196704.1   798   No protein  -                     Retained intron  -          -           TSL:3
Tm4sf1-206  ENSMUST00000198030.1   767   No protein  -                     lncRNA           -          -           TSL:3

[Figure: Ensembl region view, 36.38 kb, 57.28Mb to 57.31Mb, showing contigs AC125103.4 and AC119854.7, the six Tm4sf1 transcripts (202, 205, 201, 206, 203, 204) on the reverse strand, and the Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: Ensembl protein coding, merged Ensembl/Havana, RNA gene, processed transcript.]

Transcript: ENSMUST00000196979
[Figure: Tm4sf1-205, protein coding, reverse strand, 16.33 kb. Protein tracks for ENSMUSP00000143...: Transmembrane heli..., Low complexity (Seg), Pfam L6 membrane, PANTHER PTHR14198:SF18 L6 membrane. Sequence variants (dbSNP and all other sources): missense variant, synonymous variant. Scale bar 0 to 202.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.