https://www.alphaknockout.com

Mouse Gm5741 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gm5741 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gm5741 gene (NCBI Reference Sequence: NM_001195531; Ensembl: ENSMUSG00000095845) is located on mouse chromosome 8. Two exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 2 (Transcript: ENSMUST00000177563). Exons 1~2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gm5741 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP24-75P15 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exons 1~2 cover 100.0% of the coding region. The start codon is in exon 1 and the stop codon is in exon 2. The size of the effective cKO region is ~837 bp. The cKO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', ATG marked, gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Exon numbers shown: 1, 2, 10, 9, 8, 7, 6.]

Legend: Homology arm | Exon of mouse Gm5741 | cKO region | Exon of mouse Fbxw9 | loxP site
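The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is a minimal sketch, assuming two loxP sites in the same orientation flanking the cKO region; the `cre_excise` helper and the placeholder arm and cKO sequences are hypothetical and not taken from the report. The `LOXP` constant is the commonly cited 34 bp loxP sequence.

```python
# Commonly cited 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Toy model of Cre recombination between two same-orientation loxP sites.

    Removes the sequence between the first two sites plus one copy of the
    site, leaving a single loxP behind. Alleles with fewer than two sites
    are returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[: first + len(site)] + allele[second + len(site):]


# Hypothetical placeholder sequences standing in for the real arms and cKO region.
five_prime_arm = "AAAA"
cko_region = "CCCCGGGG"
three_prime_arm = "TTTT"

targeted = five_prime_arm + LOXP + cko_region + LOXP + three_prime_arm
constitutive_ko = cre_excise(targeted)
```

In this model the wildtype allele, which carries no loxP sites, passes through `cre_excise` unchanged, which mirrors why Cre expression only affects the targeted allele.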

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
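A self-alignment dot plot of this kind can be approximated with exact word matching. The following is a minimal sketch, assuming exact 10 bp matches on both strands; the tool and scoring actually used for the report are not stated, and the `dot_plot_matches` function is a hypothetical helper.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def dot_plot_matches(seq: str, window: int = 10):
    """List (i, j, strand) for every pair of identical windows in seq.

    "+" means the window at i equals the window at j (main diagonal i == j
    is skipped); "-" means the reverse complement of the window at i equals
    the window at j.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    # Index every window by its sequence so lookups are O(1).
    index = {}
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for i in range(n_windows):
        word = seq[i:i + window]
        for j in index.get(word, []):
            if j != i:
                matches.append((i, j, "+"))
        for j in index.get(reverse_complement(word), []):
            matches.append((i, j, "-"))
    return matches


# Toy input with one direct repeat of a 10 bp word, 14 bp apart.
toy = "ACGTTGCAAC" + "GGGG" + "ACGTTGCAAC"
toy_matches = dot_plot_matches(toy, window=10)
```

Runs of off-diagonal "+" points that lie close to the main diagonal are what a tandem repeat would look like in such a plot; an empty or sparse match list corresponds to the "no significant tandem repeat" conclusion.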

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length 6303 bp | A 26.24% (1654) | C 24.69% (1556) | T 23.26% (1466) | G 25.81% (1627)

Note: The sequence of the homology arms and cKO region was analyzed to determine its GC content. No region of significantly high GC content was found, so this region is suitable for PCR screening or sequencing analysis.
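The composition figures in the summary can be re-derived from the base counts, and the windowed profile can be sketched in a few lines. This is an illustrative sketch only: the counts come from the summary above, while `gc_windows` and its `step` parameter are hypothetical helpers, since the report does not say how the 300 bp window is advanced.

```python
# Base counts as reported in the summary line.
counts = {"A": 1654, "C": 1556, "T": 1466, "G": 1627}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)


def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return the GC percentage of each window of the given size."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy usage: two non-overlapping 4 bp windows.
toy_profile = gc_windows("GGCCAATT", window=4, step=4)
```

The counts sum to the stated full length, and the overall GC content works out to roughly 50.5%, consistent with the note that no high-GC region stands out.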

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   -       85067895   85070894   3000
YourSeq  289    558    1396  3000   86.9%     chr5   -       139485399  139486211  813
YourSeq  269    557    891   3000   95.1%     chr6   -       122331036  122532840  201805
YourSeq  248    558    895   3000   92.3%     chr13  -       12678161   12716828   38668
YourSeq  234    605    891   3000   93.7%     chr4   -       116529377  117179390  650014
YourSeq  233    604    903   3000   91.8%     chr10  +       81113424   81225280   111857
YourSeq  229    625    894   3000   93.0%     chr11  -       96140048   96140652   605
YourSeq  205    599    894   3000   93.0%     chr1   -       128280186  128321646  41461
YourSeq  202    602    903   3000   87.5%     chr10  -       76023198   76378430   355233
YourSeq  197    559    1389  3000   83.3%     chr11  +       60725674   60726046   373
YourSeq  190    602    843   3000   92.1%     chr18  +       42308982   42309619   638
YourSeq  174    559    1391  3000   82.7%     chr11  -       33165321   33165620   300
YourSeq  174    604    903   3000   88.9%     chr6   +       148997102  148997618  517
YourSeq  170    559    1069  3000   93.9%     chr17  +       69313279   69313833   555
YourSeq  165    566    1381  3000   82.7%     chr16  +       44187780   44188354   575
YourSeq  164    558    903   3000   87.8%     chr2   -       39097627   39097813   187
YourSeq  157    673    896   3000   89.2%     chr3   +       86648461   86649061   601
YourSeq  155    558    903   3000   85.4%     chr2   +       4805929    4806112    184
YourSeq  152    560    903   3000   85.6%     chr19  +       38465581   38465762   182
YourSeq  152    558    894   3000   85.5%     chr18  +       47449926   47450099   174

Note: The 3000 bp section upstream of exon 1 was searched against the genome with BLAT. No significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr8   -       85064592   85067591   3000
YourSeq  77     2632   2915  3000   71.0%     chr7   -       6934313    6934551    239
YourSeq  76     2615   2916  3000   79.7%     chr10  -       60102236   60102692   457
YourSeq  73     2568   2716  3000   80.7%     chr2   +       17136286   17136436   151
YourSeq  72     2796   2914  3000   75.8%     chr2   -       10018413   10018519   107
YourSeq  68     2606   2716  3000   90.5%     chr18  +       67831286   67831764   479
YourSeq  67     562    693   3000   93.9%     chr4   +       153652674  153652812  139
YourSeq  66     2615   2714  3000   89.5%     chr4   +       125613207  125613563  357
YourSeq  62     2577   2713  3000   88.9%     chr4   +       44855332   44855466   135
YourSeq  61     2514   2643  3000   85.9%     chrX   -       130624869  130625003  135
YourSeq  61     2815   2917  3000   79.7%     chr9   +       73908240   73908342   103
YourSeq  58     2606   2706  3000   86.3%     chr10  -       82659773   82659873   101
YourSeq  56     2245   2704  3000   72.7%     chr1   -       87256377   87256787   411
YourSeq  55     2567   2712  3000   88.8%     chr1   +       164584941  164585087  147
YourSeq  54     2618   2716  3000   77.4%     chr4   -       108911381  108911476  96
YourSeq  54     602    689   3000   80.7%     chr11  -       75664087   75664174   88
YourSeq  52     2609   2692  3000   86.4%     chr14  -       118134186  118134268  83
YourSeq  51     553    701   3000   94.8%     chr13  +       46894501   46894650   150
YourSeq  50     2758   2913  3000   96.3%     chrX   -       60363035   60363363   329
YourSeq  50     2615   2716  3000   87.1%     chr18  -       71663897   71663997   101

Note: The 3000 bp section downstream of exon 2 was searched against the genome with BLAT. No significant similarity was found.
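The BLAT rows can be read programmatically to separate the self-hit on chr8 from the remaining hits and to relate the two flanks to each other. This is a minimal sketch under stated assumptions: it takes whitespace-separated rows with the query name removed, treats the highest-scoring row as the self-hit, and uses only a few rows copied from the two tables above. `BlatHit`, `parse_hit` and `split_self_hit` are hypothetical names.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one row: SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = row.split()
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def split_self_hit(rows):
    """Return (self_hit, other_hits), taking the top-scoring row as the self-hit."""
    hits = [parse_hit(r) for r in rows]
    self_hit = max(hits, key=lambda h: h.score)
    return self_hit, [h for h in hits if h is not self_hit]


# First rows of the upstream and downstream tables in this report.
UP_ROWS = [
    "3000 1 3000 3000 100.0% chr8 - 85067895 85070894 3000",
    "289 558 1396 3000 86.9% chr5 - 139485399 139486211 813",
    "269 557 891 3000 95.1% chr6 - 122331036 122532840 201805",
]
DOWN_ROWS = [
    "3000 1 3000 3000 100.0% chr8 - 85064592 85067591 3000",
    "77 2632 2915 3000 71.0% chr7 - 6934313 6934551 239",
]

up_self, up_others = split_self_hit(UP_ROWS)
down_self, down_others = split_self_hit(DOWN_ROWS)

# The gene is on the reverse strand, so the upstream flank has the higher coordinates.
gap = up_self.t_start - down_self.t_end - 1
total = up_self.span + gap + down_self.span
```

With these rows the gap between the two flanks is 303 bp and the combined span is 6303 bp, which equals the full length given in the GC content summary. The report does not state how that sequence was assembled, so the match should be read as an observation rather than a documented fact.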

Gene information: Gm5741, predicted gene 5741 [Mus musculus (house mouse)]
Gene ID: 100503710, updated on 12-Aug-2019

Gene summary

Official Symbol: Gm5741 (provided by MGI)
Official Full Name: predicted gene 5741 (provided by MGI)
Primary source: MGI:MGI:3645690
See related: Ensembl:ENSMUSG00000095845
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: EG436049
Expression: Ubiquitous expression in whole brain E14.5 (RPKM 2.3), CNS E18 (RPKM 2.2) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 8; 8 C3

Exon count: 2

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  8    NC_000074.6 (85067568..85067982, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    8    NC_000074.5 (87591467..87591875, complement)

[Figure: genomic context, Chromosome 8 - NC_000074.6.]

Transcript information: This gene has 1 transcript

Gene: Gm5741 ENSMUSG00000095845

Description: predicted gene 5741 [Source:MGI Symbol;Acc:MGI:3645690]
Location: Chromosome 8: 85,067,568-85,067,982 reverse strand. GRCm38:CM001001.2
About this gene: This gene has 1 transcript (splice variant), 125 orthologues, 12 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name: Gm5741-201
Transcript ID: ENSMUST00000177563.1
bp: 328
Protein: 72aa
Translation ID: ENSMUSP00000136655.1
Biotype: Protein coding
CCDS: CCDS57629
UniProt: B2RVA4
Flags: TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl region view of chromosome 8, 85.060 Mb to 85.075 Mb (20.41 kb), contig AC163703.4, with the Regulatory Build track.]

Forward strand genes (Comprehensive set):
  Tnpo2: Tnpo2-201, Tnpo2-202 (protein coding); Tnpo2-205 (retained intron)
  Fbxw9: Fbxw9-201 (protein coding); Fbxw9-202, -203, -204, -207, -208, -209 (retained intron); Fbxw9-205, -206 (lncRNA)
  Mir7070: Mir7070-201 (miRNA)
  Dhps: Dhps-201 (protein coding); Dhps-207 (nonsense mediated decay); Dhps-202, -203, -204, -205, -206, -208 (retained intron)

Reverse strand genes (Comprehensive set):
  A230103J11Rik: A230103J11Rik-201, -202 (lncRNA)
  Gm5741: Gm5741-201 (protein coding)
  Wdr83: Wdr83-201, -205, -210 (protein coding); Wdr83-202, -204, -206, -207, -209 (retained intron); Wdr83-208 (nonsense mediated decay)

Regulation legend: CTCF | Open Chromatin | Promoter | Promoter Flank | Transcription Factor Binding Site

Gene legend:
  Protein coding: Ensembl protein coding | merged Ensembl/Havana
  Non-protein coding: RNA gene | processed transcript

Transcript: ENSMUST00000177563

[Figure: Ensembl protein summary for Gm5741-201 (protein coding), reverse strand, 415 bp; protein ENSMUSP00000136... with a scale bar from 0 to 72.]

Domain and feature annotations:
  Superfamily: G-protein gamma-like domain superfamily
  SMART: G-protein gamma-like domain (SM01224)
  Prints: G-protein, gamma subunit
  Pfam: G-protein gamma-like domain
  PROSITE profiles: G-protein gamma-like domain
  PANTHER: PTHR13809:SF9; G-protein, gamma subunit
  Gene3D: G-protein gamma-like domain superfamily
  CDD: G-protein gamma-like domain

Sequence variants (dbSNP and all other sources): synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
