https://www.alphaknockout.com

Mouse Gcm2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gcm2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gcm2 gene (NCBI Reference Sequence: NM_008104; Ensembl: ENSMUSG00000021362) is located on mouse chromosome 13. Five exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 5 (Transcript: ENSMUST00000021791). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gcm2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-126N18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygous null mice lack parathyroid glands and exhibit hypocalcemia, hypophosphatemia, a mild abnormal bone phenotype, and partial perinatal lethality. Hypoparathyroidism is observed although parathyroid hormone serum levels are normal.

Exon 5 covers 61.51% of the coding region. The start codon is in exon 1, and the stop codon is in exon 5. The size of intron 4 for 5'-loxP site insertion: 879 bp. The size of the effective cKO region: ~1203 bp. The cKO region does not contain any other known gene.
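The step from the targeted allele to the constitutive KO allele can be illustrated with a minimal sketch. It assumes two loxP sites in the same orientation flanking the cKO region and models Cre recombination as simple string excision that leaves one loxP site behind. The function name `cre_excise` and the toy flanking sequences are hypothetical and are not taken from the Gcm2 locus; only the 34 bp loxP sequence is the standard one.

```python
# Canonical 34 bp loxP site (13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    Returns the allele with the floxed segment removed and a single loxP
    site retained. If fewer than two sites are found, the allele is
    returned unchanged.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    # Keep everything through the first loxP, then everything after the second.
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Toy sequences (hypothetical placeholders, not real Gcm2 sequence).
upstream = "GATTACA"       # stands in for the 5' homology arm / intron 4
cko_region = "CCGGTTAACC"  # stands in for the floxed exon 5 region
downstream = "TGCATGCA"    # stands in for the 3' homology arm

targeted_allele = upstream + LOXP + cko_region + LOXP + downstream
ko_allele = cre_excise(targeted_allele)

print(len(targeted_allele), len(ko_allele))
```

In this toy case the KO allele is shorter than the targeted allele by the length of the floxed segment plus one loxP site, which is the size difference a genotyping PCR across the region would be expected to detect.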

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1-5, with gRNA regions and the 5' to 3' orientation marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gcm2, homology arm, cKO region, loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
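A self-comparison of this kind can be sketched as follows. This is only an illustration of the general dot plot idea with exact word matches at a given window size; the tool and scoring actually used for the plot above are not stated in this report. The function names `revcomp` and `self_dot_plot` and the toy sequences are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Find exact window-sized matches of a sequence against itself.

    Returns two lists of (i, j) window start positions:
    forward matches with i < j (the trivial diagonal i == j is skipped),
    and matches between the window at i and the reverse complement at j.
    """
    index = defaultdict(list)
    last_start = len(seq) - window
    for i in range(last_start + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last_start + 1):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Toy examples (hypothetical sequences, not the Gcm2 arms).
fwd, rev = self_dot_plot("GATTACAGATTACA", window=7)
print(fwd, rev)  # a direct (tandem) repeat shows up as an off-diagonal forward match
```

Off-diagonal forward matches that line up in runs indicate direct or tandem repeats, while reverse matches indicate inverted repeats; both can complicate PCR amplification and assembly of the targeting vector.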

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full length (7430 bp) | A (27.58%, 2049) | C (22.85%, 1698) | T (26.43%, 1964) | G (23.14%, 1719)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
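The windowed GC analysis can be sketched as below. The base counts are the ones reported in the summary line; the function name `gc_windows`, the step size, and the toy sequence are assumptions made for illustration, since the report does not state how the windows were stepped or what cutoff defines a "high GC-content region".

```python
def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, GC percent) for each full window along the sequence."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc / window))
    return results


# Base counts as reported in the summary line of this report.
counts = {"A": 2049, "C": 1698, "T": 1964, "G": 1719}
total = sum(counts.values())
overall_gc = 100.0 * (counts["G"] + counts["C"]) / total
print(total, round(overall_gc, 2))

# Toy sequence (hypothetical) with a small window, to show the output shape.
toy_profile = gc_windows("GGCCAATT", window=4, step=4)
print(toy_profile)
```

The reported counts sum to the stated full length of 7430 bp and give an overall GC content of roughly 46%, which is consistent with the note that no high GC-content region was found.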

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr13  -       41103940   41106939   3000
YourSeq  77     1120    1271  3000   89.7%     chr9   +       78059657   78059808   152
YourSeq  60     4       88    3000   94.2%     chr14  +       60132410   60132496   87
YourSeq  59     2       79    3000   90.5%     chr1   +       130378375  130378453  79
YourSeq  52     7       60    3000   98.2%     chr4   -       43355006   43355059   54
YourSeq  52     1736    1994  3000   68.2%     chr11  +       5761272    5761426    155
YourSeq  51     2       60    3000   93.3%     chr16  -       52310259   52310317   59
YourSeq  48     1926    2004  3000   77.5%     chr12  +       108344629  108344696  68
YourSeq  47     1960    2086  3000   94.6%     chr14  -       52798199   52798633   435
YourSeq  45     1961    2068  3000   89.4%     chr10  +       60024281   60024385   105
YourSeq  44     1971    2077  3000   95.9%     chr7   -       107276536  107276643  108
YourSeq  41     20      60    3000   100.0%    chr2   -       27410516   27410556   41
YourSeq  41     29      90    3000   90.2%     chr19  +       7490912    7490974    63
YourSeq  38     1971    2068  3000   69.4%     chrX   -       73847520   73847617   98
YourSeq  38     1978    2078  3000   93.1%     chr10  -       5241365    5241466    102
YourSeq  38     1974    2081  3000   93.2%     chr5   +       108220323  108220430  108
YourSeq  36     1975    2076  3000   94.9%     chr10  +       95497961   95498062   102
YourSeq  35     1961    2005  3000   88.9%     chr2   -       32902867   32902911   45
YourSeq  35     1957    2077  3000   83.0%     chr16  -       18236985   18237102   118
YourSeq  35     1961    1997  3000   97.3%     chr1   -       190789741  190789777  37

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr13  -       41099760   41102759   3000
YourSeq  194    2367    2760  3000   81.9%     chr9   +       56069148   56069506   359
YourSeq  193    2367    2731  3000   81.0%     chr18  +       32041254   32041590   337
YourSeq  188    2367    2749  3000   84.2%     chr10  -       71392848   71393211   364
YourSeq  184    2376    2751  3000   81.8%     chr1   -       87840020   87840366   347
YourSeq  183    2417    2747  3000   86.3%     chr1   -       136819287  136819785  499
YourSeq  183    2375    2751  3000   81.5%     chr13  +       59751026   59751372   347
YourSeq  178    2365    2729  3000   78.7%     chr9   +       70928847   70929172   326
YourSeq  175    2399    2751  3000   83.2%     chr18  -       11766306   11766599   294
YourSeq  173    2420    2751  3000   81.9%     chr12  -       41202573   41202887   315
YourSeq  172    2363    2756  3000   81.8%     chr1   +       181312749  181313126  378
YourSeq  171    2381    2755  3000   84.4%     chr13  +       114692691  114693067  377
YourSeq  169    2375    2751  3000   80.5%     chr2   +       5184688    5185040    353
YourSeq  168    2375    2748  3000   78.0%     chr2   +       69595808   69596133   326
YourSeq  168    2366    2749  3000   84.7%     chr15  +       76983194   77184678   201485
YourSeq  167    2402    2751  3000   78.7%     chrX   +       60395821   60396160   340
YourSeq  166    2427    2751  3000   83.5%     chr5   -       116184431  116184711  281
YourSeq  166    2461    2751  3000   84.4%     chr16  +       97965847   97966130   284
YourSeq  165    2375    2751  3000   82.8%     chr11  -       105098801  105099138  338
YourSeq  165    2367    2731  3000   82.0%     chr10  -       43348525   43348877   353

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
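One way to read the BLAT tables programmatically is sketched below. The report does not state what cutoff defines "significant similarity", so this sketch only separates the self hit (full-length, 100% identity) from the remaining hits and reports the best off-target score relative to the query size. The names `BlatHit`, `parse_blat_row`, and `split_self_hit` are hypothetical; the three rows are copied from the downstream table.

```python
from typing import List, NamedTuple, Tuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; only the last 10 whitespace-separated fields are used."""
    f = row.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def split_self_hit(hits: List[BlatHit]) -> Tuple[List[BlatHit], List[BlatHit]]:
    """Separate full-length 100% identity hits (the query itself) from the rest."""
    self_hits = [h for h in hits if h.score == h.q_size and h.identity == 100.0]
    others = [h for h in hits if h not in self_hits]
    return self_hits, others


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr13 - 41099760 41102759 3000",
    "YourSeq 194 2367 2760 3000 81.9% chr9 + 56069148 56069506 359",
    "YourSeq 193 2367 2731 3000 81.0% chr18 + 32041254 32041590 337",
]
hits = [parse_blat_row(r) for r in rows]
self_hits, others = split_self_hit(hits)
best_off_target = max(others, key=lambda h: h.score)
off_target_fraction = best_off_target.score / best_off_target.q_size
print(best_off_target.chrom, best_off_target.score, round(off_target_fraction, 3))
```

For the downstream arm, the best off-target hit scores 194 against a 3000 bp query, and the off-target hits cluster in the same query interval (roughly positions 2360-2760), a pattern that is often seen with interspersed repeats.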

Gene and information: Gcm2 glial cells missing homolog 2 [ Mus musculus (house mouse) ] Gene ID: 107889, updated on 10-Oct-2019

Gene summary

Official Symbol: Gcm2 (provided by MGI)
Official Full Name: glial cells missing homolog 2 (provided by MGI)
Primary source: MGI:MGI:1861438
See related: Ensembl:ENSMUSG00000021362
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gcm1; Gcm-rs1; Gcm1-rs2
Expression: Low expression observed in reference dataset
Orthologs: human, all

Genomic context

Location: 13; 13 A3.3

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (41101427..41111035, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (41196796..41205357, complement)

[Figure: genomic context of Gcm2 on chromosome 13 (NC_000079.6).]

Transcript information: This gene has 3 transcripts

Gene: Gcm2 ENSMUSG00000021362

Description: glial cells missing homolog 2 [Source:MGI Symbol;Acc:MGI:1861438]
Gene Synonyms: Gcm1-rs2
Location: Chromosome 13: 41,101,427-41,111,035 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 3 transcripts (splice variants), 193 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 21 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Gcm2-201  ENSMUST00000021791.7  3073  504aa       ENSMUSP00000021791.6  Protein coding   CCDS26472  A0A0R4J021  TSL:1, GENCODE basic, APPRIS P1
Gcm2-202  ENSMUST00000225271.1  688   183aa       ENSMUSP00000153244.1  Protein coding   -          A0A286YD23  CDS 3' incomplete
Gcm2-203  ENSMUST00000225420.1  2662  No protein  -                     Retained intron  -          -           -

[Figure: Ensembl genome browser view of a 29.61 kb region (41.10 Mb to 41.12 Mb). Forward strand: Gm48344-201 (processed pseudogene), Sycp2l-201 (protein coding). Contig: AC158538.2. Reverse strand: Gcm2-203 (retained intron), Gcm2-201 (protein coding), Gcm2-202 (protein coding). Regulatory Build legend: CTCF, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (pseudogene, processed transcript).]

Transcript: ENSMUST00000021791

[Figure: protein domain view of Gcm2-201 (protein coding, reverse strand, 8.76 kb) for ENSMUSP00000021..., with a scale bar from 0 to 504 aa. Tracks: MobiDB lite; Superfamily: GCM domain superfamily; Pfam: Transcription regulator GCM domain; PROSITE profiles: Transcription regulator GCM domain; PANTHER: PTHR12414:SF7; Chorion-specific transcription factor GCM; Gene3D: 2.20.28.80, 3.30.1370.90; sequence variants (dbSNP and all other sources). Variant legend: missense variant, stop retained variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
