https://www.alphaknockout.com

Mouse Chm Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Chm conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Chm gene (NCBI Reference Sequence: NM_018818; Ensembl: ENSMUSG00000025531) is located on mouse chromosome X. Fifteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 15 (Transcript: ENSMUST00000026607). Exon 4 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Chm gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-275G16 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: For one disruption of this gene, heterozygous female and hemizygous male null mice display embryonic lethality with abnormal extraembryonic tissue development. For other disruptions, however, heterozygous mice survive and display depigmentation and degeneration of the retina.

Exon 4 starts at about 9.57% of the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 (for 5'-loxP site insertion) is 1253 bp, and the size of intron 4 (for 3'-loxP site insertion) is 22905 bp. The size of the effective cKO region is ~625 bp. The cKO region does not contain any other known gene.
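The position and frameshift statements above can be illustrated with a short sketch. It assumes the 9.57% figure is measured along the coding sequence of ENSMUST00000026607 (662 aa per the transcript table in this report) and that the stop codon is counted. The function names are hypothetical, and the coding length of exon 4 is not stated in this report, so the frameshift check is shown only as a general rule.

```python
def approx_cds_offset(fraction: float, protein_aa: int, include_stop: bool = True) -> int:
    """Approximate nucleotide offset into the CDS for a fractional position.

    Assumption: the CDS length is (amino acids [+ stop codon]) * 3.
    """
    codons = protein_aa + (1 if include_stop else 0)
    return round(fraction * codons * 3)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 662 aa is taken from the transcript table (Chm-201); 0.0957 from the strategy text.
offset = approx_cds_offset(0.0957, 662)
print(f"Exon 4 begins roughly {offset} nt into the CDS (near codon {offset // 3 + 1})")
```

Under these assumptions the deletion begins early in the reading frame, which is consistent with the expectation that a frameshift there disrupts most of the protein.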


Overview of the Targeting Strategy

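[Figure: schematic of the targeting strategy showing four alleles: the wildtype allele (exons 1, 3, 4 and 15 labeled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Chm, homology arm, cKO region, loxP site.]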


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of the sequence, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
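The self-alignment described in the note can be sketched as follows. This is a minimal illustration of a word-match dot plot with a 10 bp window, not the tool used to produce the report. The function names are hypothetical, and exact word matching is an assumption (real dot plot tools may tolerate mismatches).

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) word-match coordinates.

    forward: the word starting at i equals the word starting at j (i != j).
    reverse: the reverse complement of the word at i equals the word at j.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_words):
        word = seq[i:i + window]
        # Off-diagonal forward matches indicate direct repeats.
        forward.extend((i, j) for j in index[word] if j != i)
        # Reverse-complement matches indicate inverted repeats.
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Toy sequence with one direct repeat of a 10-mer, 13 bp apart.
fwd, rev = self_dot_plot("AAACCCGGGT" + "TTT" + "AAACCCGGGT", window=10)
print(fwd)
```

A run of adjacent off-diagonal points parallel to the main diagonal would correspond to a tandem or dispersed repeat; an empty forward list means no repeated 10-mers were found.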

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot along the sequence.]

Summary: Full Length(7125bp) | A(28.11% 2003) | C(17.04% 1214) | T(35.99% 2564) | G(18.86% 1344)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
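As a worked check of the summary figures, the overall GC content follows directly from the base counts, and a sliding-window profile can be computed in the same way. The sketch below uses the counts from the summary line; the `windowed_gc` helper is a hypothetical illustration of a 300 bp window scan, not the vendor's implementation.

```python
def gc_percent(seq: str) -> float:
    """GC content of a DNA string, in percent."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percent for each full window of the given size along the sequence."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts copied from the summary line of this report.
counts = {"A": 2003, "C": 1214, "T": 2564, "G": 1344}
total = sum(counts.values())
overall_gc = 100.0 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, overall GC = {overall_gc:.2f}%")
```

The counts sum to the stated full length of 7125 bp, and the overall GC content works out to about 35.9%.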


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   -       113139711  113142710  3000
YourSeq  49     2848   2971  3000   79.5%     chr12  +       25422245   25422364   120
YourSeq  41     2926   2976  3000   90.2%     chrX   -       56617103   56617153   51
YourSeq  41     2937   3000  3000   81.7%     chr14  -       53369551   53369610   60
YourSeq  41     2886   2969  3000   70.7%     chr12  +       9988743    9988821    79
YourSeq  40     2927   2972  3000   93.5%     chr9   -       51742563   51742608   46
YourSeq  39     2927   2971  3000   93.4%     chr14  -       71263831   71263875   45
YourSeq  39     2925   2979  3000   85.5%     chr10  +       66957255   66957309   55
YourSeq  37     2929   2972  3000   95.2%     chr11  -       40001109   40001152   44
YourSeq  37     2926   2970  3000   91.2%     chr8   +       43114879   43114923   45
YourSeq  35     2925   2969  3000   88.9%     chr1   -       132646834  132646878  45
YourSeq  35     2930   2972  3000   90.7%     chr12  +       65927682   65927724   43
YourSeq  34     2926   2971  3000   87.0%     chr15  -       62643310   62643355   46
YourSeq  33     2755   2970  3000   50.0%     chr18  -       16711874   16711911   38
YourSeq  33     2926   2972  3000   85.2%     chr1   +       20285298   20285344   47
YourSeq  32     2936   2971  3000   94.5%     chr11  -       42979221   42979256   36
YourSeq  32     2934   2971  3000   92.2%     chr10  -       82713930   82713967   38
YourSeq  32     2934   2971  3000   91.2%     chr15  +       63719565   63719601   37
YourSeq  32     2921   2970  3000   82.0%     chr13  +       65090283   65090332   50
YourSeq  31     2910   2944  3000   94.3%     chr6   -       27699144   27699178   35

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chrX   -       113136086  113139085  3000
YourSeq  179    1079   1291  3000   93.9%     chr7   -       3244769    3244978    210
YourSeq  176    1038   1279  3000   90.2%     chr19  +       5397695    5397896    202
YourSeq  172    1079   1268  3000   95.8%     chr2   -       4653016    4653207    192
YourSeq  171    1083   1267  3000   96.3%     chr9   -       111031846  111032030  185
YourSeq  171    1055   1266  3000   93.0%     chr11  +       98162295   98162599   305
YourSeq  170    1058   1280  3000   89.8%     chr16  -       32308593   32308807   215
YourSeq  169    1029   1253  3000   89.6%     chr3   -       66665715   66665924   210
YourSeq  169    1080   1266  3000   95.2%     chr9   +       49212199   49212385   187
YourSeq  169    1055   1266  3000   94.8%     chr14  +       55534420   55534771   352
YourSeq  168    1081   1267  3000   95.2%     chr17  -       87217094   87217283   190
YourSeq  168    1075   1275  3000   92.6%     chr1   +       132247247  132247459  213
YourSeq  166    1081   1264  3000   95.7%     chr14  -       79670604   79670791   188
YourSeq  166    1055   1263  3000   89.8%     chr14  +       79294786   79294974   189
YourSeq  166    1082   1268  3000   94.7%     chr14  +       7811295    7811487    193
YourSeq  166    1061   1265  3000   89.6%     chr11  +       80418714   80418908   195
YourSeq  165    1055   1265  3000   89.4%     chr14  -       54422598   54422786   189
YourSeq  165    1078   1271  3000   94.6%     chrX   +       102162642  102162934  293
YourSeq  165    1059   1265  3000   89.0%     chr13  +       55131842   55132033   192
YourSeq  165    1078   1266  3000   94.7%     chr13  +       46874033   46874233   201

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
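The top hit in each BLAT table is the query matching its own genomic location, so the two self-hits can be used to cross-check the stated cKO region size. The sketch below assumes the reported genomic START and END values are 1-based and inclusive (which agrees with the SPAN column); the `Interval` class and `gap_between` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    strand: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    @property
    def span(self) -> int:
        return self.end - self.start + 1


def gap_between(a: Interval, b: Interval) -> int:
    """Number of bases lying strictly between two intervals on one chromosome."""
    if a.chrom != b.chrom:
        raise ValueError("intervals are on different chromosomes")
    left, right = sorted((a, b), key=lambda iv: iv.start)
    return max(0, right.start - left.end - 1)


# Self-hits copied from the first row of each BLAT table. The gene is on the
# reverse strand, so the upstream section has the higher coordinates.
upstream = Interval("chrX", "-", 113139711, 113142710)
downstream = Interval("chrX", "-", 113136086, 113139085)

print(upstream.span, downstream.span, gap_between(upstream, downstream))
```

Under this coordinate assumption, the gap between the two 3000 bp sections is 625 bp, matching the ~625 bp effective cKO region given in the strategy summary.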


Gene and information: Chm choroidermia (RAB escort protein 1) [ Mus musculus (house mouse) ] Gene ID: 12662, updated on 24-Oct-2019

Gene summary

Official Symbol: Chm (provided by MGI)
Official Full Name: choroidermia (RAB escort protein 1) (provided by MGI)
Primary source: MGI:MGI:892979
See related: Ensembl:ENSMUSG00000025531
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Rep-1
Expression: Broad expression in frontal lobe adult (RPKM 3.1), cerebellum adult (RPKM 2.6) and 24 other tissues
Orthologs: human

Genomic context

Location: X; X E1

Exon count: 18

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (113040592..113185539, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (110154201..110299124, complement)

[Figure: genomic context of Chm on Chromosome X - NC_000086.7.]


Transcript information: This gene has 5 transcripts

Gene: Chm ENSMUSG00000025531

Description: choroidermia (RAB escort protein 1) [Source:MGI Symbol;Acc:MGI:892979]
Gene Synonyms: Rep-1
Location: Chromosome X: 113,040,593-113,185,517 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 5 transcripts (splice variants), 187 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 16 phenotypes.

Transcripts

Name     Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Chm-201  ENSMUST00000026607.14  4868  662aa       ENSMUSP00000026607.8  Protein coding  CCDS41107  A2AD03   TSL:1 GENCODE basic APPRIS P1
Chm-202  ENSMUST00000113388.2   4749  623aa       ENSMUSP00000109015.2  Protein coding  -          Q3UR39   TSL:1 GENCODE basic
Chm-204  ENSMUST00000135821.7   3386  No protein  -                     lncRNA          -          -        TSL:1
Chm-205  ENSMUST00000153417.1   706   No protein  -                     lncRNA          -          -        TSL:2
Chm-203  ENSMUST00000133469.1   347   No protein  -                     lncRNA          -          -        TSL:5

[Figure: Ensembl genome browser view of a 164.93 kb region (113.04 Mb to 113.18 Mb) with forward and reverse strand tracks. Features shown: contig AL669958.14; Gm14939-201 (processed pseudogene); Mir361-201 (miRNA); Chm-201 and Chm-202 (protein coding); Chm-203, Chm-204 and Chm-205 (lncRNA); and the Regulatory Build track. Regulation legend: CTCF, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (pseudogene, RNA gene).]


Transcript: ENSMUST00000026607

[Figure: protein domain and variant view for Chm-201 (protein coding; reverse strand, 144.93 kb), drawn on a scale of 0 to 662 amino acids. Annotation tracks:
- MobiDB lite; Low complexity (Seg)
- Superfamily: FAD/NAD(P)-binding domain superfamily; SSF54373
- Prints: Rab protein geranylgeranyltransferase component A; GDP dissociation inhibitor
- Pfam: GDP dissociation inhibitor
- PIRSF: Rab protein geranylgeranyltransferase component A
- PANTHER: GDP dissociation inhibitor; PTHR11787:SF9
- Gene3D: 1.10.405.10; FAD/NAD(P)-binding domain superfamily; 3.30.519.10
- Sequence variants (dbSNP and all other sources). Variant legend: inframe deletion, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
