Mouse Wdr4 Knockout Project (CRISPR/Cas9)
http://www.alphaknockout.com/

Objective:
To create a Wdr4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Wdr4 gene (NCBI Reference Sequence: NM_021322; Ensembl: ENSMUSG00000024037) is located on mouse chromosome 17.
11 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 11 (Transcript Wdr4-207: ENSMUST00000171171).
Exons 4~6 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Mice homozygous for a null allele display lethality during organogenesis with increased apoptosis and DNA damage.
Exon 4 starts at about 31.58% of the coding region.
Exons 4~6 cover 24.2% of the coding region.
The size of the effective KO region: ~2971 bp.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3', showing exons 1, 4, 5, 6 and 11, with a gRNA region on each side of exons 4~6. Legends: exon of mouse Wdr4; knockout region.]

Overview of the Dot Plot (up)
[Figure: dot plot, window size 15 bp, showing forward and reverse complement matches.]
Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)
[Figure: dot plot, window size 15 bp, showing forward and reverse complement matches.]
Note: The 502 bp section downstream of Exon 6 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
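The self-alignment idea behind the dot plots can be illustrated with a minimal Python sketch. It is not the tool used to produce the report's figures: the function names and the `max_offset` cutoff are hypothetical, only forward-strand matches are considered (the report's plots also show reverse complement matches), and "tandem repeat" is approximated as an identical window recurring a short distance downstream.

```python
from collections import defaultdict


def self_dotplot_matches(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the window-length
    words starting at i and j are identical (off-diagonal dots)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    matches = []
    for starts in positions.values():
        # Every pair of start positions sharing the same word is one dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


def has_tandem_repeat(seq, window=15, max_offset=100):
    """Heuristic (assumption): report a tandem repeat when an identical
    window recurs within max_offset bases of its first occurrence."""
    return any(j - i <= max_offset
               for i, j in self_dotplot_matches(seq, window))


if __name__ == "__main__":
    repetitive = "ACGTTGCA" * 5   # toy sequence built from an 8 bp unit
    print(has_tandem_repeat(repetitive, window=15))
```

With a real flanking sequence, the positions returned by `self_dotplot_matches` would indicate which stretches to avoid when placing a gRNA or PCR primer.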
Overview of the GC Content Distribution (up)
[Figure: GC content distribution, window size 300 bp.]
Summary: Full Length(2000bp) | A(19.7% 394) | C(25.0% 500) | G(28.95% 579) | T(26.35% 527)
Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
[Figure: GC content distribution, window size 300 bp.]
Summary: Full Length(502bp) | A(16.14% 81) | C(28.09% 141) | G(25.7% 129) | T(30.08% 151)
Note: The 502 bp section downstream of Exon 6 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr17  -       31503667   31505666   2000
YourSeq  78     1403   1531  2000   90.6%     chr1   -       183592734  183592876  143
YourSeq  76     1408   1545  2000   89.3%     chr1   -       135169283  135259536  90254
YourSeq  67     1401   1531  2000   90.2%     chr14  -       61692495   61692636   142
YourSeq  65     1397   1518  2000   91.1%     chr2   +       37462331   37462454   124
YourSeq  55     1472   1547  2000   86.9%     chr11  -       116838090  117243524  405435
YourSeq  55     1394   1522  2000   92.7%     chr13  +       49307979   49308127   149
YourSeq  55     1471   1534  2000   93.7%     chr10  +       93874592   93874655   64
YourSeq  53     1472   1535  2000   92.1%     chr11  -       3624955    3625019    65
YourSeq  52     1471   1531  2000   89.7%     chr16  +       32182412   32182470   59
YourSeq  52     1415   1518  2000   89.6%     chr14  +       119013826  119013948  123
YourSeq  52     1471   1538  2000   95.0%     chr11  +       72079611   72079685   75
YourSeq  51     1443   1522  2000   96.5%     chr1   -       156397811  156398239  429
YourSeq  50     1461   1522  2000   93.3%     chr7   -       19144318   19144381   64
YourSeq  50     1393   1492  2000   96.3%     chr12  -       111057628  111057727  100
YourSeq  50     1365   1530  2000   88.9%     chr9   +       21182432   21182738   307
YourSeq  50     1471   1540  2000   87.2%     chr19  +       36745204   36745276   73
YourSeq  49     1472   1531  2000   87.8%     chr12  +       21568059   21568116   58
YourSeq  48     1423   1519  2000   92.8%     chr7   -       126214472  126214587  116
YourSeq  48     1475   1531  2000   93.0%     chr10  +       126868211  126868268  58
Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  502    1      502   502    100.0%    chr17  -       31500713   31501214   502
YourSeq  44     409    452   502    100.0%    chr10  +       117523853  117523896  44
YourSeq  32     416    450   502    97.2%     chr2   -       169588768  169588818  51
YourSeq  25     141    170   502    85.2%     chr12  +       63871936   63871963   28
YourSeq  24     131    160   502    76.0%     chr14  -       121746354  121746378  25
YourSeq  22     436    459   502    95.9%     chr13  -       14010069   14010092   24
YourSeq  22     144    169   502    92.4%     chr1   -       194539660  194539685  26
YourSeq  21     311    331   502    100.0%    chr13  +       108860314  108860334  21
YourSeq  20     161    180   502    100.0%    chr14  -       59795218   59795237   20
Note: The 502 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
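One way to read the BLAT tables is to compare the best non-self hit with the query length. The Python sketch below is only illustrative: the `BlatHit` type and helper names are hypothetical, the rows are copied from the top of the two tables, and using score divided by query size as a rough measure of similarity is an assumption, not the criterion the report states.

```python
from typing import NamedTuple, Optional


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated table row (QUERY column first)."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_hit(hit: BlatHit) -> bool:
    """Assumption: the full-length, 100% identity hit is the query itself."""
    return hit.score == hit.q_size and hit.identity == 100.0


def best_secondary(hits) -> Optional[BlatHit]:
    """Highest-scoring hit other than the self hit, or None."""
    secondary = [h for h in hits if not is_self_hit(h)]
    return max(secondary, key=lambda h: h.score) if secondary else None


def score_fraction(hit: BlatHit) -> float:
    """Hit score relative to query length (illustrative measure only)."""
    return hit.score / hit.q_size


UP_ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr17 - 31503667 31505666 2000",
    "YourSeq 78 1403 1531 2000 90.6% chr1 - 183592734 183592876 143",
    "YourSeq 76 1408 1545 2000 89.3% chr1 - 135169283 135259536 90254",
]
DOWN_ROWS = [
    "YourSeq 502 1 502 502 100.0% chr17 - 31500713 31501214 502",
    "YourSeq 44 409 452 502 100.0% chr10 + 117523853 117523896 44",
]

if __name__ == "__main__":
    for label, rows in (("up", UP_ROWS), ("down", DOWN_ROWS)):
        best = best_secondary([parse_hit(r) for r in rows])
        print(label, best.chrom, best.score, f"{score_fraction(best):.1%}")
```

In both tables the best secondary hit scores well under a tenth of the query length, which is in line with the notes that no significant similarity is found.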
Gene and protein information:
Wdr4 WD repeat domain 4 [ Mus musculus (house mouse) ]
Gene ID: 57773, updated on 10-Oct-2019

Gene summary
Official Symbol: Wdr4 (provided by MGI)
Official Full Name: WD repeat domain 4 (provided by MGI)
Primary source: MGI:MGI:1889002
See related: Ensembl:ENSMUSG00000024037
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Wh; mWH; AI415180; AI448349; D530049K22Rik
Expression: Ubiquitous expression in ovary adult (RPKM 9.8), limb E14.5 (RPKM 7.4) and 28 other tissues
Orthologs: human; all

Genomic context
Location: 17; 17 B1
Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (31494322..31519985, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (31631267..31649432, complement)

[Figure: genomic context, Chromosome 17 - NC_000083.6.]

Transcript information:
This gene has 14 transcripts
Gene: Wdr4 ENSMUSG00000024037
Description: WD repeat domain 4 [Source:MGI Symbol;Acc:MGI:1889002]
Gene Synonyms: D530049K22Rik, Wh
Location: Chromosome 17: 31,494,322-31,519,980 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 14 transcripts (splice variants), 187 orthologues, is a member of 1 Ensembl protein family and is associated with 5 phenotypes.
Transcripts
Name      Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Wdr4-207  ENSMUST00000171171.8  4154  456aa       ENSMUSP00000126061.1  Protein coding           CCDS28606  E9Q156      TSL:1, GENCODE basic, APPRIS P1
Wdr4-211  ENSMUST00000236665.1  1897  426aa       ENSMUSP00000157778.1  Protein coding           -          A0A494B9R5  CDS 5' incomplete
Wdr4-208  ENSMUST00000171291.1  599   46aa        ENSMUSP00000132290.1  Protein coding           -          E9Q1L4      CDS 3' incomplete, TSL:3
Wdr4-214  ENSMUST00000237807.1  395   33aa        ENSMUSP00000157734.1  Protein coding           -          A0A494B9T1  CDS 3' incomplete
Wdr4-213  ENSMUST00000237127.1  361   120aa       ENSMUSP00000158546.1  Protein coding           -          A0A494BBN7  CDS 5' and 3' incomplete
Wdr4-210  ENSMUST00000172284.8  3633  102aa       ENSMUSP00000129736.2  Nonsense mediated decay  -          F6YNC4      CDS 5' incomplete, TSL:1
Wdr4-206  ENSMUST00000170176.8  3513  107aa       ENSMUSP00000127073.1  Nonsense mediated decay  -          F6VLT8      CDS 5' incomplete, TSL:1
Wdr4-205  ENSMUST00000169454.7  1300  No protein  -                     Retained intron          -          -           TSL:1
Wdr4-204  ENSMUST00000169224.2  770   No protein  -                     Retained intron          -          -           TSL:3
Wdr4-202  ENSMUST00000166992.1  769   No protein  -                     Retained intron          -          -           TSL:3
Wdr4-203  ENSMUST00000167419.7  813   No protein  -                     lncRNA                   -          -           TSL:3
Wdr4-212  ENSMUST00000236913.1  761   No protein  -                     lncRNA                   -          -           -
Wdr4-201  ENSMUST00000166626.1  454   No protein  -                     lncRNA                   -          -           TSL:3
Wdr4-209  ENSMUST00000171592.1  423   No protein  -                     lncRNA                   -          -           TSL:3

[Figure: Ensembl region view, 45.66 kb, 31.71Mb to 31.74Mb. Forward strand genes (comprehensive set from GENCODE): Gm50103-201 (lncRNA); Ndufv3-201, Ndufv3-202 and Ndufv3-205 (protein coding); Ndufv3-203 and Ndufv3-204 (processed transcript). Contig: AC166172.1. Reverse strand genes: Gm50102-201 (TEC); Wdr4-207, Wdr4-208, Wdr4-211, Wdr4-213 and Wdr4-214 (protein coding); Wdr4-206 and Wdr4-210 (nonsense mediated decay); Wdr4-202, Wdr4-204 and Wdr4-205 (retained intron); Wdr4-201, Wdr4-203, Wdr4-209 and Wdr4-212 (processed transcript). Regulatory Build track. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana), non-protein coding (processed transcript; RNA gene). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank.]

Transcript: ENSMUST00000171171
[Figure: protein view of Wdr4-207 (protein coding), reverse strand, 18.43 kb, ENSMUSP000001260... Tracks: MobiDB lite; Low complexity (Seg); Superfamily: WD40-repeat-containing domain superfamily; SMART: WD40 repeat; Pfam: WD40 repeat; PROSITE profiles: WD40 repeat, WD40-repeat-containing domain; PANTHER: tRNA (guanine-N(7)-)-methyltransferase non-catalytic subunit; HAMAP: tRNA (guanine-N(7)-)-methyltransferase non-catalytic subunit; Gene3D: WD40/YVTN repeat-like-containing domain superfamily; All sequence SNPs/in...: sequence variants (dbSNP and all other sources). Variant legend: frameshift variant, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 456.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.