Mouse Themis Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com

Objective: To create a Themis knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Themis gene (NCBI Reference Sequence: NM_178666; Ensembl: ENSMUSG00000049109) is located on mouse chromosome 10. Six exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000056097). Exon 4 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Homozygous null mice have defects in T cell positive selection, which leave very few alpha-beta T cells in the periphery.
Exon 4 starts at about 37.37% of the coding region.
Exon 4 covers 55.14% of the coding region.
The size of the effective KO region: ~1052 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 4 and 6 labeled and two gRNA regions marked. Legend: exon of mouse Themis; knockout region.]

Overview of the Dot Plot (up)

[Figure: dot plot, window size 15 bp, showing forward and reverse-complement matches.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

[Figure: dot plot, window size 15 bp, showing forward and reverse-complement matches.]

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
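The self-alignment behind the dot plots can be sketched in a few lines. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of 15 bp windows (real dot plot programs usually tolerate mismatches), and the function names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact matches of `window` bp.

    Returns two sorted lists of (i, j) start positions:
      forward: the window at i equals the window at j (i < j, so the
               trivial main diagonal is left out);
      reverse: the window at j equals the reverse complement of the
               window at i (pairs are reported in both orientations).
    """
    seq = seq.upper()
    # Index every window by its sequence so repeats share one bucket.
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = [
        (i, j)
        for positions in index.values()
        for i in positions
        for j in positions
        if i < j
    ]

    reverse = []
    for i in range(len(seq) - window + 1):
        rc = reverse_complement(seq[i:i + window])
        for j in index.get(rc, []):
            reverse.append((i, j))

    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    # Toy input: a 7 bp unit repeated twice in tandem.
    fwd, rev = self_dot_plot("GATTACAGATTACA", window=7)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

Runs of forward matches parallel to the main diagonal correspond to the tandem repeats mentioned in the notes, so a candidate gRNA site would be checked against the positions returned here.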
Overview of the GC Content Distribution (up)

[Figure: GC content plot, window size 300 bp.]

Summary: Full Length (2000 bp) | A (33.25%, 665) | C (15.0%, 300) | T (34.0%, 680) | G (17.75%, 355)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

[Figure: GC content plot, window size 300 bp.]

Summary: Full Length (2000 bp) | A (32.1%, 642) | C (16.35%, 327) | T (30.8%, 616) | G (20.75%, 415)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr10  +        28779692   28781691  2000
YourSeq     30    764   849   2000     94.2%  chrX   +        53347115   53347202    88
YourSeq     23    749   777   2000     89.7%  chr1   -       132087700  132087728    29
YourSeq     23    751   775   2000     87.5%  chr2   +        86204200   86204223    24
YourSeq     20   1383  1402   2000    100.0%  chr2   +       105299604  105299623    20

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
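The GC content summaries above can be rechecked from the reported base counts. The sketch below is an illustration only: the report does not include the flank sequences, so `sliding_gc` is shown on a toy string, and all function names are hypothetical. Only the base counts and the 300 bp window size are taken from the report.

```python
def gc_fraction_from_counts(counts: dict) -> float:
    """GC fraction from a dict of base counts with keys A, C, G, T."""
    total = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each window of length `window`, advancing by `step`."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


# Base counts copied from the two Summary lines of the report.
UPSTREAM = {"A": 665, "C": 300, "T": 680, "G": 355}
DOWNSTREAM = {"A": 642, "C": 327, "T": 616, "G": 415}

if __name__ == "__main__":
    print(f"upstream GC:   {gc_fraction_from_counts(UPSTREAM):.2%}")
    print(f"downstream GC: {gc_fraction_from_counts(DOWNSTREAM):.2%}")
    # Toy sequence only; the real 2000 bp flanks are not in the report.
    print(sliding_gc("GGCCAATT", window=4, step=2))
```

With the reported counts this gives an overall GC fraction of 32.75% upstream and 37.1% downstream, in line with the note that neither flank contains a high-GC region.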
BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr10  +        28782744   28784743  2000
YourSeq    124   1259  1382   2000    100.0%  chr17  +        94030130   94030253   124
YourSeq    118   1259  1380   2000     98.4%  chr9   -       107383449  107383570   122
YourSeq    113   1259  1373   2000     99.2%  chr5   +        12983446   12983560   115
YourSeq    112   1258  1380   2000     96.0%  chr6   -        28146081   28146204   124
YourSeq    112   1262  1373   2000    100.0%  chr4   +        33071707   33071818   112
YourSeq    111   1259  1369   2000    100.0%  chr3   +        48906684   48906794   111
YourSeq    110   1259  1371   2000     99.2%  chr8   -        42352268   42352380   113
YourSeq    110   1262  1373   2000     99.2%  chr1   -       142409106  142409217   112
YourSeq    109   1259  1370   2000     99.1%  chr12  -        78064424   78064537   114
YourSeq    108   1259  1366   2000    100.0%  chr14  +        81587685   81587792   108
YourSeq    107   1259  1365   2000    100.0%  chr6   -         5875411    5875517   107
YourSeq    105   1258  1366   2000     98.2%  chr14  -        56279944   56280052   109
YourSeq    105   1259  1363   2000    100.0%  chr12  +        90094162   90094266   105
YourSeq    104   1259  1364   2000     99.1%  chr19  +        51365077   51365182   106
YourSeq    103   1259  1363   2000     99.1%  chr17  -        61027844   61027948   105
YourSeq    103   1259  1361   2000    100.0%  chr1   -       160035146  160035248   103
YourSeq    103   1259  1363   2000     99.1%  chr9   +        90559869   90559973   105
YourSeq    101   1259  1359   2000    100.0%  chr8   -        83219255   83219355   101
YourSeq    101   1259  1366   2000     97.2%  chr12  -        28133152   28133260   109

Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
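As a consistency check, the chr10 self-hits in the two BLAT tables can be used to recover the interval between the flanks. The sketch below rests on assumptions not stated in the report: that the two 2000 bp flanks directly abut the targeted region, that the coordinates are 1-based and inclusive, and that the coding length is the 636 aa translation of ENSMUST00000056097 (listed in the transcript table) times 3, with the stop codon excluded. The function names are hypothetical.

```python
# chr10 coordinates of the 100% self-hits in the two BLAT tables.
UP_FLANK = (28779692, 28781691)
DOWN_FLANK = (28782744, 28784743)

# Assumption: coding length = 636 codons x 3 nt, stop codon not counted.
PROTEIN_AA = 636


def span(interval: tuple) -> int:
    """Length of a 1-based, inclusive interval."""
    return interval[1] - interval[0] + 1


def region_between(up: tuple, down: tuple) -> tuple:
    """Interval lying strictly between two flanks, as (start, end, length)."""
    start = up[1] + 1
    end = down[0] - 1
    return start, end, end - start + 1


if __name__ == "__main__":
    start, end, length = region_between(UP_FLANK, DOWN_FLANK)
    cds_nt = PROTEIN_AA * 3
    print(f"flank lengths: {span(UP_FLANK)} bp, {span(DOWN_FLANK)} bp")
    print(f"region between flanks: chr10:{start}-{end} ({length} bp)")
    print(f"share of coding length: {100 * length / cds_nt:.2f}%")
    print(f"length modulo 3: {length % 3}")
```

Under these assumptions the interval is chr10:28781692-28782743, or 1052 bp, which matches the stated ~1052 bp effective KO region and the 55.14% coverage figure (1052 of 1908 nt). The length is not a multiple of 3, so removing exactly this interval from the coding sequence would be expected to shift the reading frame.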
Gene and protein information:
Themis thymocyte selection associated [ Mus musculus (house mouse) ]
Gene ID: 210757, updated on 24-Oct-2019

Gene summary
Official Symbol: Themis (provided by MGI)
Official Full Name: thymocyte selection associated (provided by MGI)
Primary source: MGI:MGI:2443552
See related: Ensembl:ENSMUSG00000049109
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gasp; Spot; Tsepa; thylex; E430004N04Rik
Summary: This gene encodes a protein that plays a regulatory role in both positive and negative T-cell selection during late thymocyte development. The protein functions through T-cell antigen receptor signaling, and is necessary for proper lineage commitment and maturation of T-cells. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Mar 2015]
Expression: Restricted expression toward thymus adult (RPKM 8.8)
Orthologs: human, all

Genomic context
Location: 10; 10 A4
Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (28668327..28883820)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (28388201..28602555)

[Figure: genomic context, Chromosome 10 - NC_000076.6.]

Transcript information: This gene has 8 transcripts

Gene: Themis ENSMUSG00000049109
Description: thymocyte selection associated [Source:MGI Symbol;Acc:MGI:2443552]
Gene Synonyms: E430004N04Rik, Gasp, Tsepa
Location: Chromosome 10: 28,668,360-28,883,818 forward strand. GRCm38:CM001003.2
About this gene: This gene has 8 transcripts (splice variants), 153 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 32 phenotypes.
Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Themis-201  ENSMUST00000056097.10  5082  636aa       ENSMUSP00000060129.4  Protein coding           CCDS23758  Q8BGW0   TSL:1, GENCODE basic, APPRIS P2
Themis-203  ENSMUST00000105516.8   2361  595aa       ENSMUSP00000101155.2  Protein coding           -          Q8BGW0   TSL:1, GENCODE basic, APPRIS ALT2
Themis-202  ENSMUST00000060409.12  2278  605aa       ENSMUSP00000055315.6  Protein coding           -          Q8BGW0   TSL:1, GENCODE basic, APPRIS ALT2
Themis-205  ENSMUST00000161345.1    636  203aa       ENSMUSP00000123894.1  Protein coding           -          E0CYT7   CDS 3' incomplete, TSL:3
Themis-204  ENSMUST00000159927.7   4827  94aa        ENSMUSP00000123919.1  Nonsense mediated decay  -          E0CY68   TSL:1
Themis-206  ENSMUST00000162202.7   1252  94aa        ENSMUSP00000124451.1  Nonsense mediated decay  -          E0CY68   TSL:1
Themis-208  ENSMUST00000219119.1   3461  No protein  -                     Retained intron          -          -        TSL:NA
Themis-207  ENSMUST00000162343.1   3403  No protein  -                     Retained intron          -          -        TSL:1

[Figure: Ensembl region view, 235.46 kb, 28.70 Mb to 28.85 Mb. Forward strand: Themis-201, -202, -203 and -205 (protein coding), Themis-204 and -206 (nonsense mediated decay), Themis-207 and -208 (retained intron). Contigs: AC152983.2, AC159472.6. Reverse strand: Gm47834-201 (processed pseudogene), 4930519F09Rik-201 (lncRNA). Regulation legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (pseudogene, processed transcript, RNA gene).]

[Figure: Transcript ENSMUST00000056097, 215.46 kb, forward strand. Themis-201 (protein coding), ENSMUSP00000060... Tracks: MobiDB lite; low complexity (Seg); Pfam CABIT domain; PANTHER PTHR15215:SF1, Protein THEMIS; All sequence SNPs/i...]
[Figure track: sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 636.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.