https://www.alphaknockout.com

Mouse Qrich2 Knockout Project (CRISPR/Cas9)

Objective: To create a Qrich2 knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Qrich2 gene (NCBI Reference Sequence: NM_001033267; Ensembl: ENSMUSG00000070331) is located on mouse chromosome 11. Fourteen exons have been identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000093909). Exons 3~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit male infertility associated with decreased epididymis weight, multiple morphological abnormalities of the sperm flagella, oligozoospermia, and asthenozoospermia.

Exon 3 starts at about 2.59% of the coding region. Exons 3~13 cover 86.37% of the coding region. The size of the effective KO region is ~5501 bp. The KO region does not contain any other known gene.
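The percentages above can be turned into approximate base-pair figures. The following is a minimal sketch, not part of the original design: it assumes the coding length equals three times the 592 aa protein length listed for Qrich2-201 in the transcript table of this report, with the stop codon excluded. The report does not state which coding length its percentages are based on, so the results are estimates only. The names `CDS_LENGTH_BP` and `percent_to_bp` are illustrative.

```python
# Assumption: coding length = 592 codons x 3 bp, stop codon excluded.
PROTEIN_LENGTH_AA = 592
CDS_LENGTH_BP = PROTEIN_LENGTH_AA * 3


def percent_to_bp(percent: float, cds_length: int = CDS_LENGTH_BP) -> int:
    """Convert a percentage of the coding region to a rounded bp count."""
    return round(percent / 100.0 * cds_length)


# Coding bases that precede exon 3 (exon 3 starts at about 2.59%).
coding_bp_before_exon3 = percent_to_bp(2.59)

# Coding bases covered by exons 3~13 (86.37% of the coding region).
coding_bp_in_target = percent_to_bp(86.37)
```

Under this assumption, roughly 46 coding bases lie upstream of exon 3 and roughly 1534 coding bases fall inside the targeted exons.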


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked; exon labels 1 and 3 through 14 are shown. Legend: exon of mouse Qrich2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
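A self-alignment dot plot of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool used for the report: it treats a "dot" as an exact match between two windows of the given size, while the actual software may tolerate mismatches. The function names `reverse_complement` and `self_dot_plot` are illustrative.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact window matches.

    Returns two sets of (i, j) start positions (0-based):
    forward dots, where window i equals window j (i != j), and
    reverse dots, where window i equals the reverse complement of window j.
    The trivial main diagonal (i == j) is left out of the forward set.
    """
    seq = seq.upper()
    # Index every window by its sequence so matches are found without
    # comparing all pairs of positions.
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = set(), set()
    for kmer, positions in index.items():
        for i in positions:
            for j in positions:
                if i != j:
                    forward.add((i, j))
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                reverse.add((i, j))
    return forward, reverse
```

For a flank without tandem or inverted repeats, both sets would be expected to be empty or nearly so, which corresponds to a dot plot showing only the main diagonal.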

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12.]

Note: The 1358 bp section downstream of Exon 13 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(21.8% 436) | C(29.65% 593) | T(27.55% 551) | G(21.0% 420)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(1358bp) | A(24.96% 339) | C(27.54% 374) | T(22.02% 299) | G(25.48% 346)

Note: The 1358 bp section downstream of Exon 13 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
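The overall GC content of each flank follows directly from the base counts in the two summaries, and a windowed profile can be computed in the same way. The sketch below is illustrative only: the base counts are copied from the summaries above, while the `step` parameter and the function names are assumptions, since the report gives only the 300 bp window size.

```python
def gc_percent(counts: dict) -> float:
    """Overall GC percentage from a dict of base counts (A, C, G, T)."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC percentage of each full window along a sequence.

    The step between windows is an assumption; the report states only
    the window size.
    """
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the two summary lines above.
upstream_counts = {"A": 436, "C": 593, "T": 551, "G": 420}
downstream_counts = {"A": 339, "C": 374, "T": 299, "G": 346}

upstream_gc = gc_percent(upstream_counts)      # (593 + 420) / 2000
downstream_gc = gc_percent(downstream_counts)  # (374 + 346) / 1358
```

From these counts the upstream flank is 50.65% GC and the downstream flank is about 53.02% GC.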


BLAT Search Results (up)

QUERY     SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
--------------------------------------------------------------------------------------------
YourSeq    2000      1   2000   2000    100.0%  chr11  -       116448414  116450413    2000
YourSeq     169    279    569   2000     82.9%  chr9   -        75642238   75642561     324
YourSeq     166    301    569   2000     87.4%  chr4   -       105008642  105008952     311
YourSeq     163    340    569   2000     88.4%  chr17  -        63736367   63736635     269
YourSeq     163      1    253   2000     83.0%  chr4   +       128494728  128494948     221
YourSeq     162    288    569   2000     86.7%  chr10  -       129734683  129734995     313
YourSeq     161    342    569   2000     87.1%  chr14  -       103369537  103369804     268
YourSeq     160    298    558   2000     84.2%  chr15  +        11144275   11144571     297
YourSeq     159    288    555   2000     85.1%  chr5   -        72507814   72508132     319
YourSeq     149    335    558   2000     87.4%  chr6   +        31503304   31503576     273
YourSeq     147    298    558   2000     85.2%  chrX   +        40250315   40653690  403376
YourSeq     145    306    569   2000     86.7%  chrX   +        59654874   59655190     317
YourSeq     145    299    570   2000     87.1%  chr1   +        92809786   92810093     308
YourSeq     144    288    571   2000     88.3%  chr9   +       103099548  103099859     312
YourSeq     140    288    558   2000     82.4%  chr12  +         8532223    8532552     330
YourSeq     139      4    260   2000     80.5%  chr1   -        35219719   35219947     229
YourSeq     135    288    558   2000     80.6%  chr10  +        93102657   93102902     246
YourSeq     133    288    558   2000     85.1%  chr9   -       121570043  121570312     270
YourSeq     132    343    569   2000     87.3%  chr4   -         8264335    8264601     267
YourSeq     132    313    569   2000     83.6%  chr5   +       107388845  107389151     307

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
--------------------------------------------------------------------------------------------
YourSeq    1358      1   1358   1358    100.0%  chr11  -       116441555  116442912    1358
YourSeq      24    177    200   1358    100.0%  chr14  -        13353136   13353159      24
YourSeq      22    687    710   1358     87.0%  chr12  -        81230080   81230102      23
YourSeq      22    188    219   1358     84.4%  chr6   +        40069060   40069091      32
YourSeq      21    686    706   1358    100.0%  chr1   -       132155709  132155729      21
YourSeq      21    845    865   1358    100.0%  chr11  +       120800065  120800085      21
YourSeq      20    687    706   1358    100.0%  chr10  -        25018724   25018743      20
YourSeq      20    977    996   1358    100.0%  chr1   -       138119738  138119757      20
YourSeq      20    686    705   1358    100.0%  chr11  +       107689637  107689656      20

Note: The 1358 bp section downstream of Exon 13 is BLAT searched against the genome. No significant similarity is found.
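The 100% identity hits in the two BLAT tables give the genomic positions of the flanks, so the interval between them can be checked against the stated KO region size. The sketch below is a consistency check only; the report does not say how the ~5501 bp figure was derived. Coordinates are copied from the first row of each table and treated as inclusive. The `Hit` type and the helper names are illustrative.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A genomic hit with inclusive start and end coordinates."""
    chrom: str
    strand: str
    start: int
    end: int


# Self-hits (100.0% identity) from the two BLAT tables. The gene is on
# the reverse strand, so the upstream flank has the higher coordinates.
upstream_flank = Hit("chr11", "-", 116448414, 116450413)
downstream_flank = Hit("chr11", "-", 116441555, 116442912)


def span(hit: Hit) -> int:
    """Length of a hit in bp, counting both end positions."""
    return hit.end - hit.start + 1


def gap_between(a: Hit, b: Hit) -> int:
    """Number of bases lying strictly between two hits on one chromosome."""
    if a.chrom != b.chrom:
        raise ValueError("hits are on different chromosomes")
    low, high = sorted((a, b), key=lambda h: h.start)
    return high.start - low.end - 1


region_between_flanks = gap_between(upstream_flank, downstream_flank)
```

With these coordinates the two spans are 2000 bp and 1358 bp, and the interval between the flanks is 5501 bp, which agrees with the effective KO region size given in the strategy summary.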


Gene and information:

Qrich2 glutamine rich 2 [Mus musculus (house mouse)]
Gene ID: 217341, updated on 10-Oct-2019

Gene summary

Official Symbol: Qrich2 (provided by MGI)
Official Full Name: glutamine rich 2 (provided by MGI)
Primary source: MGI:MGI:2684912
See related: Ensembl:ENSMUSG00000070331
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm66
Expression: Restricted expression toward testis adult (RPKM 124.1)

Genomic context

Location: 11; 11 E2
Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (116441323..116466266, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (116302639..116315661, complement)

Chromosome 11 - NC_000077.6


Transcript information: This gene has 4 transcripts

Gene: Qrich2 ENSMUSG00000070331

Description: glutamine rich 2 [Source:MGI Symbol;Acc:MGI:2684912]
Gene Synonyms: LOC217341
Location: Chromosome 11: 116,441,325-116,466,241 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 4 transcripts (splice variants), 139 orthologues, 3 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Qrich2-201  ENSMUST00000093909.11  2156  592aa       ENSMUSP00000091437.3  Protein coding   CCDS25667  Q3V2A7      TSL:1 GENCODE basic APPRIS P2
Qrich2-204  ENSMUST00000208602.1   7260  2337aa      ENSMUSP00000147009.1  Protein coding   -          A0A140LIY9  TSL:5 GENCODE basic APPRIS ALT2
Qrich2-202  ENSMUST00000134182.2   2162  No protein  -                     Retained intron  -          -           TSL:1
Qrich2-203  ENSMUST00000140697.1   797   No protein  -                     Retained intron  -          -           TSL:5

[Figure: Ensembl genome browser view, 44.92 kb, 116.44 Mb to 116.47 Mb. Forward strand: Ubald2-201 (protein coding), Ubald2-202 (lncRNA), Gm11739-201 (lncRNA). Contigs: AL645861.12, AL645851.8. Reverse strand: Qrich2-201 and Qrich2-204 (protein coding), Qrich2-202 and Qrich2-203 (retained intron), Prpsap1-201 (protein coding), Prpsap1-204, Prpsap1-205, Prpsap1-207 and Prpsap1-209 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: merged Ensembl/Havana and Ensembl protein coding; processed transcript and RNA gene (non-protein coding).]


Transcript: ENSMUST00000093909

[Figure: protein feature view of Qrich2-201 (protein coding; reverse strand, 13.02 kb), translation ENSMUSP00000091... Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Pfam (Protein of unknown function DUF4795); PANTHER (PTHR46766); sequence variants (dbSNP and all other sources). Variant legend: inframe deletion, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 592 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
