https://www.alphaknockout.com

Mouse Reep2 Knockout Project (CRISPR/Cas9)

Objective: To create a Reep2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Reep2 gene (NCBI Reference Sequence: NM_144865; Ensembl: ENSMUSG00000038555) is located on mouse chromosome 18. Eight exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000043484). Exons 1~8 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.13% of the coding region. Exons 1~8 cover 100.0% of the coding region. The size of the effective KO region: ~5711 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with exons 1 to 8 of mouse Reep2; two gRNA regions and the knockout region are marked. Legend: exon of mouse Reep2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
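The self-alignment described in this note can be sketched in a few lines. The following is only an illustration of a word-match dot plot with a 15 bp window, not the tool used to produce the figure; the function names `self_dot_plot` and `reverse_complement` and the toy sequence are assumptions introduced here.

```python
from collections import defaultdict

# Complement table for plain DNA letters; other symbols pass through unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact word matches.

    Returns two lists of (i, j) start positions (0-based):
    forward matches with j > i (the trivial diagonal is skipped), and
    matches between the word at i and the reverse complement of the word at j.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), ()))
    return forward, reverse


# Toy input (hypothetical): a 7 bp unit repeated twice, scanned with a 7 bp window.
fwd, rev = self_dot_plot("GATTACAGATTACA", window=7)
print(fwd, rev)
```

Runs of forward dots parallel to the main diagonal indicate tandem or direct repeats; a gRNA site would then be chosen outside the positions covered by such runs.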

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot. Legend: Forward; Reverse Complement. Axis label: Sequence 12]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeats are found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12]

Summary: Full Length (2000 bp) | A (26.1%, 522) | C (22.85%, 457) | T (25.15%, 503) | G (25.9%, 518)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
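A sliding-window GC profile of the kind summarized above could be computed as follows. This is a minimal sketch assuming a 300 bp window moved one base at a time; the step size, the function names `gc_profile` and `overall_gc_percent`, and the toy sequence are assumptions, since the report states only the window size. The overall GC figure is derived from the base counts in the summary line.

```python
def gc_profile(seq, window=300):
    """Percent GC for every full window, sliding one base at a time."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    count = sum(is_gc[:window])
    profile = [100 * count / window]
    for i in range(window, len(seq)):
        # Add the base entering the window, drop the base leaving it.
        count += is_gc[i] - is_gc[i - window]
        profile.append(100 * count / window)
    return profile


def overall_gc_percent(a, c, t, g):
    """Overall percent GC from the four base counts."""
    return 100 * (g + c) / (a + c + t + g)


# Base counts taken from the upstream summary line (2000 bp in total).
upstream_gc = overall_gc_percent(a=522, c=457, t=503, g=518)
print(upstream_gc)

# Toy sequence (hypothetical) with a short window, to show the profile shape.
print(gc_profile("GGCCAATT", window=4))
```

Windows whose value stays well above the overall figure mark the high GC-content regions that the gRNA site is placed outside of; the cutoff for "high" is not given in the report and would have to be chosen by the user.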

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12]

Summary: Full Length (2000 bp) | A (26.0%, 520) | C (25.6%, 512) | T (19.9%, 398) | G (28.5%, 570)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
----------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr18  +        34838757   34840756  2000
YourSeq    135      1    158   2000     96.0%  chr16  +        48369589   48369752   164
YourSeq     32    734    806   2000     66.7%  chr1   -        65478532   65478576    45
YourSeq     32   1910   1951   2000     97.2%  chr11  +       102206715  102206757    43
YourSeq     31   1196   1263   2000     61.8%  chr8   +        25559728   25559761    34
YourSeq     26    109    140   2000     96.5%  chr13  -        58992174   58992206    33
YourSeq     25   1193   1225   2000     87.9%  chr8   +        34303514   34303546    33
YourSeq     23   1193   1220   2000     92.6%  chr2   -       106441999  106442031    33
YourSeq     23   1193   1216   2000    100.0%  chr9   +        85695991   85696018    28
YourSeq     23    616    638   2000    100.0%  chr15  +        44398361   44398383    23
YourSeq     22    161    182   2000    100.0%  chr9   +        97590360   97590381    22
YourSeq     22    427    448   2000    100.0%  chr1   +       119173097  119173118    22
YourSeq     21   1032   1052   2000    100.0%  chr17  -        34426311   34426331    21
YourSeq     21   1198   1218   2000    100.0%  chr16  -        39843365   39843385    21
YourSeq     21   1198   1218   2000    100.0%  chr1   -        63074080   63074100    21
YourSeq     21    424    446   2000     95.7%  chr1   -        57953541   57953563    23
YourSeq     21   1031   1051   2000    100.0%  chr5   +       100570766  100570786    21
YourSeq     20   1479   1498   2000    100.0%  chr1   -        23872105   23872124    20
YourSeq     20   1195   1218   2000     91.7%  chr13  +        88741106   88741129    24

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
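One way to screen a BLAT hit list like the one above is to set aside the query's own full-length match and look at what remains above a chosen score. The sketch below is illustrative only: the names `BlatHit`, `parse_hit` and `off_target_hits`, and the score cutoff of 100, are assumptions and not criteria stated in this report. The three example rows are copied from the upstream results.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one result row; any text before 'YourSeq' is ignored."""
    fields = row.split()
    f = fields[fields.index("YourSeq") + 1:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def off_target_hits(hits, min_score):
    """Hits at or above min_score, excluding full-length 100% self matches."""
    return [h for h in hits
            if h.score >= min_score
            and not (h.q_start == 1 and h.q_end == h.q_size and h.identity == 100.0)]


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr18 + 34838757 34840756 2000",
    "YourSeq 135 1 158 2000 96.0% chr16 + 48369589 48369752 164",
    "YourSeq 32 734 806 2000 66.7% chr1 - 65478532 65478576 45",
]
hits = [parse_hit(r) for r in rows]
# min_score=100 is a hypothetical cutoff chosen for this example only.
for hit in off_target_hits(hits, min_score=100):
    print(hit.chrom, hit.score, hit.q_start, hit.q_end, hit.identity)
```

With these three rows, only the chr16 hit remains; it covers query positions 1 to 158, a short stretch of the 2000 bp section.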

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
----------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr18  +        34846468   34848467  2000
YourSeq    129   1803   1977   2000     88.8%  chr13  -        87007464   87007636   173
YourSeq    128   1815   1977   2000     88.9%  chr3   -       110341671  110341829   159
YourSeq    124   1807   1977   2000     84.7%  chr2   -         7786074    7786234   161
YourSeq    121   1805   1957   2000     92.4%  chr5   +       113697026  113697178   153
YourSeq    119   1741   1923   2000     91.1%  chr1   -       165169037  165169549   513
YourSeq    118   1801   1940   2000     92.9%  chr4   -       130517974  130518117   144
YourSeq    113   1805   1930   2000     95.3%  chr4   +       151987331  151987604   274
YourSeq    113   1814   1977   2000     83.6%  chr10  +        35119020   35119175   156
YourSeq    112   1805   1929   2000     95.2%  chr6   +        95206682   95206808   127
YourSeq    111   1801   1930   2000     93.8%  chr11  -        53514886   53515016   131
YourSeq    111   1805   1930   2000     94.5%  chr10  -        75969549   75969676   128
YourSeq    110   1800   1930   2000     92.4%  chrX   +        71370627   71370759   133
YourSeq    108   1800   1930   2000     91.7%  chr9   -       108335788  108335920   133
YourSeq    106   1807   1949   2000     88.4%  chr9   +        95730511   95730653   143
YourSeq    106   1805   1930   2000     92.8%  chr17  +        37030965   37031093   129
YourSeq    106   1813   1930   2000     95.8%  chr11  +        53467886   53468006   121
YourSeq    105   1805   1929   2000     93.4%  chr8   -        93062465   93062591   127
YourSeq    105   1813   1930   2000     95.0%  chr10  -       109984337  109984456   120
YourSeq    104   1786   1930   2000     91.3%  chr12  -       111033296  111033740   445

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
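The coordinates in the two BLAT tables allow a quick arithmetic check of the stated KO region size. The sketch below assumes that the top-scoring chr18 hits are the 2000 bp flanks themselves, so that the upstream flank ends at the base before the start codon and the downstream flank starts at the base after the stop codon; the helper names are hypothetical.

```python
def bases_between(left_end, right_start):
    """Count bases strictly between two 1-based inclusive coordinates."""
    return right_start - left_end - 1


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# chr18 coordinates of the top-scoring hits in the two BLAT tables.
UPSTREAM_FLANK_END = 34840756      # last base of the upstream 2000 bp section
DOWNSTREAM_FLANK_START = 34846468  # first base of the downstream 2000 bp section

ko_region = bases_between(UPSTREAM_FLANK_END, DOWNSTREAM_FLANK_START)
print(ko_region)  # 5711, in line with the stated ~5711 bp

# Gene span from the annotated location 34840589..34847463 on chr18.
gene_span = span_length(34840589, 34847463)
print(gene_span)  # 6875, i.e. about 6.88 kb
```

Under that assumption, the interval from start codon to stop codon matches the reported effective KO region size.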


Gene information: Reep2, receptor accessory protein 2 [Mus musculus (house mouse)]
Gene ID: 225362, updated on 12-Aug-2019

Gene summary

Official Symbol: Reep2 (provided by MGI)
Official Full Name: receptor accessory protein 2 (provided by MGI)
Primary source: MGI:MGI:2385070
See related: Ensembl:ENSMUSG00000038555
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Expression: Biased expression in cerebellum adult (RPKM 71.0), testis adult (RPKM 55.4) and 9 other tissues
Orthologs: human; all

Genomic context

Location: 18; 18 B1

Exon count: 8

Annotation release | Status            | Assembly                     | Chr | Location
108                | current           | GRCm38.p6 (GCF_000001635.26) | 18  | NC_000084.6 (34840589..34847463)
Build 37.2         | previous assembly | MGSCv37 (GCF_000001635.18)   | 18  | NC_000084.5 (35000312..35007109)

Chromosome 18 - NC_000084.6


Transcript information: This gene has 1 transcript

Gene: Reep2 ENSMUSG00000038555

Description: receptor accessory protein 2 [Source:MGI Symbol;Acc:MGI:2385070]
Location: Chromosome 18: 34,840,589-34,847,463 forward strand. GRCm38:CM001011.2
About this gene: This gene has 1 transcript (splice variant), 200 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name      | Transcript ID        | bp   | Protein | Translation ID       | Biotype        | CCDS      | UniProt | Flags
Reep2-201 | ENSMUST00000043484.7 | 1926 | 254aa   | ENSMUSP00000036065.7 | Protein coding | CCDS29135 | Q8VCD6  | TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 26.88 kb (34.835 Mb to 34.855 Mb), forward and reverse strands. Gene track (Comprehensive set...): Kdm3b-201 (protein coding), Reep2-201 (protein coding), Kdm3b-206 (nonsense mediated decay), Kdm3b-202 (lncRNA). Contig: AC114820.4. Regulatory Build track. Regulation legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000043484

[Figure: Ensembl transcript view, 6.88 kb, forward strand, Reep2-201 (protein coding). Protein tracks for ENSMUSP00000036...: Transmembrane heli...; MobiDB lite; low complexity (Seg); Pfam (TB2/DP1/HVA22-related protein); PANTHER (TB2/DP1/HVA22-related protein, PTHR12300:SF29); All sequence SNPs/i... (sequence variants, dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant. Scale bar: 0 to 254.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
