Mouse Tmem106c Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective:
To create a Tmem106c knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Tmem106c gene (NCBI Reference Sequence: NM_201359; Ensembl: ENSMUSG00000052369) is located on mouse chromosome 15.
Eight exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 8 (Transcript: ENSMUST00000064200).
Exons 2~8 will be selected as the target site.
Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production.
The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 2 starts from about 0.13% of the coding region.
Exons 2~8 cover 100.0% of the coding region.
The size of the effective KO region: ~4867 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3' with exons 1 to 8 and two gRNA regions. Legend: exon of mouse Tmem106c; knockout region.]

Overview of the Dot Plot (up)
[Figure: dot plot, window size 15 bp, showing forward and reverse complement matches.]
Note: The 2000 bp section upstream of the start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
[Figure: dot plot, window size 15 bp, showing forward and reverse complement matches.]
Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
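The self-alignment described in the dot plot notes can be illustrated with a minimal Python sketch. It is not the tool used to produce the report: the function name `self_dot_plot`, the exact-match word criterion, and the toy sequence are assumptions made for illustration, and only forward-strand matches are considered. Each returned pair marks an off-diagonal dot, that is, a word of `window` bases that occurs at two different positions in the same sequence.

```python
from collections import defaultdict


def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the words of length
    `window` starting at i and j are identical (forward strand only)."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    hits = []
    # Any word seen more than once yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Hypothetical toy sequence: a 7-base unit repeated twice in tandem.
toy = "GATTACAGATTACA"
print(self_dot_plot(toy, window=7))  # [(0, 7)]
```

An empty result for a 2000 bp flank at a 15 bp window would correspond to a dot plot showing only the main diagonal. Clusters of hits with a constant offset `j - i` would appear as lines parallel to the diagonal, which is the pattern typically read as a tandem repeat.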
Overview of the GC Content Distribution (up)
[Figure: GC content distribution, window size 300 bp.]
Summary: Full Length(2000bp) | A(23.75% 475) | C(25.3% 506) | T(25.05% 501) | G(25.9% 518)
Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)
[Figure: GC content distribution, window size 300 bp.]
Summary: Full Length(2000bp) | A(24.0% 480) | C(25.35% 507) | T(26.35% 527) | G(24.3% 486)
Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr15  +       97962851   97964850   2000
YourSeq  132    95     689   2000   79.0%     chr9   -       53542846   53543090   245
YourSeq  132    536    713   2000   89.3%     chr2   +       145955648  145955831  184
YourSeq  128    545    716   2000   88.8%     chr11  -       74732800   74732970   171
YourSeq  127    545    716   2000   90.0%     chr10  -       61403512   61403890   379
YourSeq  126    544    713   2000   90.9%     chr14  +       102397628  102397798  171
YourSeq  126    545    716   2000   88.9%     chr1   +       36163767   36163946   180
YourSeq  125    536    713   2000   83.6%     chrX   +       25501643   25501809   167
YourSeq  125    549    716   2000   87.5%     chr2   +       71390408   71390737   330
YourSeq  124    548    713   2000   87.5%     chrX   -       144270550  144270711  162
YourSeq  123    545    707   2000   90.2%     chr4   +       139849381  139849544  164
YourSeq  122    400    689   2000   88.7%     chr8   -       32769572   32770137   566
YourSeq  121    545    687   2000   92.4%     chr8   -       88244061   88244203   143
YourSeq  121    545    689   2000   91.8%     chr12  +       84509301   84509445   145
YourSeq  120    549    716   2000   90.6%     chr1   -       54965170   54965342   173
YourSeq  119    543    689   2000   90.5%     chr1   -       9841486    9841632    147
YourSeq  119    545    689   2000   91.1%     chr4   +       143030059  143030203  145
YourSeq  118    542    689   2000   89.9%     chr9   -       106597930  106598077  148
YourSeq  117    545    713   2000   89.4%     chrX   -       101181833  101182008  176
YourSeq  117    545    689   2000   90.4%     chr1   -       80299696   80299840   145
Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr15  +       97969718   97971717   2000
YourSeq  139    1845   2000  2000   94.9%     chrX   +       6858350    6858507    158
YourSeq  137    1849   2000  2000   95.4%     chr19  +       21537646   21537799   154
YourSeq  136    1849   2000  2000   96.0%     chr17  -       82389282   82389434   153
YourSeq  136    1848   2000  2000   95.4%     chr2   +       33888926   33889078   153
YourSeq  135    1832   2000  2000   87.7%     chr8   -       94724245   94724406   162
YourSeq  135    1849   2000  2000   94.8%     chr5   -       63543892   63544045   154
YourSeq  135    1847   2000  2000   93.4%     chr17  -       53464865   53465017   153
YourSeq  135    1849   2000  2000   94.8%     chr6   +       35185430   35185582   153
YourSeq  135    1842   2000  2000   93.1%     chr5   +       117340337  117340502  166
YourSeq  135    1849   2000  2000   94.8%     chr4   +       35213982   35214135   154
YourSeq  135    1848   1999  2000   94.8%     chr13  +       92262611   92262764   154
YourSeq  134    1849   2000  2000   95.4%     chr7   -       96164071   96164223   153
YourSeq  134    1849   2000  2000   94.1%     chr2   -       3497712    3497863    152
YourSeq  134    1849   2000  2000   94.1%     chr15  -       35322514   35322665   152
YourSeq  134    1848   2000  2000   94.2%     chr14  -       45265181   45265335   155
YourSeq  134    1848   2000  2000   93.4%     chr14  +       106164371  106164522  152
YourSeq  134    1845   2000  2000   91.7%     chr10  +       59299196   59299350   155
YourSeq  133    1848   2000  2000   94.7%     chr8   -       120675421  120675573  153
YourSeq  133    1847   2000  2000   93.6%     chr3   -       53775410   53775565   156
Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Tmem106c transmembrane protein 106C [ Mus musculus (house mouse) ]
Gene ID: 380967, updated on 24-Oct-2019

Gene summary
Official Symbol: Tmem106c (provided by MGI)
Official Full Name: transmembrane protein 106C (provided by MGI)
Primary source: MGI:MGI:1196384
See related: Ensembl:ENSMUSG00000052369
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI046681; BC046621; D15Ertd405e
Expression: Ubiquitous expression in genital fat pad adult (RPKM 24.2), bladder adult (RPKM 18.3) and 28 other tissues
Orthologs: human, all

Genomic context
Location: 15 F1; 15 53.96 cM
Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (97963177..97970286)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (97794710..97800706)

[Figure: genomic context, Chromosome 15 - NC_000081.6.]

Transcript information:
This gene has 9 transcripts.

Gene: Tmem106c ENSMUSG00000052369
Description: transmembrane protein 106C [Source:MGI Symbol;Acc:MGI:1196384]
Gene Synonyms: D15Ertd405e
Location: Chromosome 15: 97,964,200-97,970,275 forward strand. GRCm38:CM001008.2
About this gene: This gene has 9 transcripts (splice variants), 166 orthologues and is a member of 1 Ensembl protein family.

Transcripts
Name          Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Tmem106c-201  ENSMUST00000064200.8  1623  260aa       ENSMUSP00000069764.7  Protein coding   CCDS27785  Q80VP8      TSL:1, GENCODE basic, APPRIS P1
Tmem106c-202  ENSMUST00000229428.1  1455  260aa       ENSMUSP00000154819.1  Protein coding   CCDS27785  Q80VP8      GENCODE basic, APPRIS P1
Tmem106c-209  ENSMUST00000231144.1  734   166aa       ENSMUSP00000155384.1  Protein coding   -          A0A2R8VHS8  CDS 3' incomplete
Tmem106c-203  ENSMUST00000229433.1  631   165aa       ENSMUSP00000154837.1  Protein coding   -          A0A2R8VJM7  CDS 3' incomplete
Tmem106c-205  ENSMUST00000230072.1  401   116aa       ENSMUSP00000155091.1  Protein coding   -          A0A2R8W6L2  CDS 3' incomplete
Tmem106c-204  ENSMUST00000230005.1  927   No protein  -                     Retained intron  -          -           -
Tmem106c-206  ENSMUST00000230144.1  727   No protein  -                     Retained intron  -          -           -
Tmem106c-208  ENSMUST00000231079.1  571   No protein  -                     Retained intron  -          -           -
Tmem106c-207  ENSMUST00000230361.1  585   No protein  -                     lncRNA           -          -           -

[Figure: Ensembl region view, 26.08 kb, 97.96 Mb to 97.98 Mb. Forward strand genes (comprehensive set): Tmem106c-201, -202, -203, -205, -209 (protein coding); Tmem106c-204, -206, -208 (retained intron); Tmem106c-207 (lncRNA). Contigs: AC158787.14, AC134554.6. Reverse strand genes (comprehensive set): Col2a1-201, Col2a1-202 (protein coding); Col2a1-205 (lncRNA). Track: Regulatory Build. Regulation legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (processed transcript; RNA gene).]

Transcript: ENSMUST00000064200
[Figure: transcript Tmem106c-201 (protein coding), 6.05 kb, forward strand, with protein ENSMUSP00000069... Tracks: Transmembrane heli...; Low complexity (Seg); Pfam: Protein of unknown function DUF1356, TMEM106; PANTHER: PTHR28556:SF5, Protein of unknown function DUF1356, TMEM106; All sequence SNPs/i...: sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 260.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
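The base-composition summaries and the 300 bp windowed GC profile reported in the GC content sections can be reproduced in outline with the Python sketch below. This is an illustration under stated assumptions, not the report's actual pipeline: the function names `base_composition` and `gc_windows`, the `step` parameter, and the synthetic sequence are hypothetical. The synthetic sequence is built only to carry the base counts quoted for the upstream region, since the real flanking sequence is not included in this document.

```python
from collections import Counter


def base_composition(seq):
    """Return {base: (count, percent)} for A, C, G, T in `seq`."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: (counts[b], 100.0 * counts[b] / total) for b in "ACGT"}


def gc_windows(seq, window=300, step=1):
    """Return (start, GC fraction) for each window of length `window`."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


# Synthetic 2000 bp sequence with the upstream counts from the summary
# (A 475, C 506, T 501, G 518); the base order here is arbitrary.
synthetic = "A" * 475 + "C" * 506 + "T" * 501 + "G" * 518
for base, (count, pct) in base_composition(synthetic).items():
    print(base, count, round(pct, 2))
```

The four counts sum to 2000 and give 23.75%, 25.3%, 25.05% and 25.9%, matching the upstream summary line. On a real flank, windows whose GC fraction is well above the average would be the candidate high GC-content regions; the threshold the report uses for "significant" is not stated, so none is assumed here.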
