https://www.alphaknockout.com

Mouse Phc3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Phc3 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Phc3 gene (NCBI Reference Sequence: NM_153421; Ensembl: ENSMUSG00000037652) is located on mouse chromosome 3. Fifteen exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 15 (Transcript: ENSMUST00000168645). Exon 8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Phc3 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-146F20 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 8 starts from about 29.73% of the coding region. The knockout of exon 8 will result in a frameshift of the gene. The size of intron 7 for 5'-loxP site insertion is 5252 bp, and the size of intron 8 for 3'-loxP site insertion is 4392 bp. The size of the effective cKO region is ~1372 bp. The cKO region does not contain any other known gene.
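As a rough illustration of where exon 8 falls in the protein, the 29.73% figure can be converted into an approximate coding nucleotide and codon position. This is only a sketch: it assumes the 981 aa protein length listed for ENSMUST00000168645 in the transcript table, and it assumes the percentage is measured over the coding nucleotides excluding the stop codon, which the report does not state.

```python
# Sketch only: estimates where exon 8 begins within the Phc3 coding sequence.
# Assumption: 981 aa is the protein length listed for ENSMUST00000168645.
# Assumption: the 29.73% figure is relative to the coding nucleotides
# excluding the stop codon (the report does not say how it was computed).
protein_length_aa = 981
fraction_of_cds = 0.2973

coding_nt = protein_length_aa * 3                # coding nucleotides, no stop codon
approx_nt = round(fraction_of_cds * coding_nt)   # approximate first coding nt of exon 8
approx_codon = (approx_nt - 1) // 3 + 1          # 1-based codon containing that nucleotide

print(f"coding length: {coding_nt} nt")
print(f"exon 8 starts near coding nt {approx_nt}, codon {approx_codon}")
```

Under these assumptions roughly the first 30% of the protein is encoded upstream of exon 8.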


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Rows: wildtype allele (exons 1 to 15, with the 5' and 3' gRNA regions marked); targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Phc3; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot. Legend: Forward; Reverse Complement. Label: Sequence 12]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Label: Sequence 12]

Summary: Full Length(7872bp) | A(26.63% 2096) | C(19.58% 1541) | T(33.7% 2653) | G(20.1% 1582)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
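The base counts in the summary line can be checked, and the overall GC fraction derived, with a few lines of arithmetic. The sketch below uses only the counts reported above. The overall GC percentage it prints is derived here and is not a figure stated in the report, and it says nothing about the 300 bp windowed distribution shown in the plot.

```python
# Base counts copied from the Summary line of this report.
counts = {"A": 2096, "C": 1541, "T": 2653, "G": 1582}

total = sum(counts.values())  # should match the reported full length


def percent(n, whole):
    """Return n as a percentage of whole."""
    return 100.0 * n / whole


# Per-base composition, rounded to two decimals as in the report.
composition = {base: round(percent(n, total), 2) for base, n in counts.items()}

# Overall GC content of the analyzed sequence (derived, not reported).
gc_percent = round(percent(counts["G"] + counts["C"], total), 2)

print(f"total length: {total} bp")
print(f"composition (%): {composition}")
print(f"overall GC content: {gc_percent}%")
```

The counts sum to the reported full length, and the per-base percentages agree with the summary line.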


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       30937407   30940406   3000
YourSeq  127    1323   1534  3000   85.4%     chr12  -       111253524  111253712  189
YourSeq  125    1363   1539  3000   87.5%     chr5   +       98959586   98959750   165
YourSeq  122    1379   1539  3000   91.7%     chr17  -       15560878   15561043   166
YourSeq  118    1388   1883  3000   82.4%     chr2   -       155291641  155292049  409
YourSeq  115    1386   1539  3000   91.9%     chr1   -       12999131   12999418   288
YourSeq  114    1393   1534  3000   92.0%     chr12  -       119256990  119257136  147
YourSeq  113    1401   1534  3000   93.8%     chr17  -       72655501   72655639   139
YourSeq  109    1390   1538  3000   93.0%     chr3   -       35412513   35412666   154
YourSeq  109    1363   1516  3000   84.1%     chr13  +       97374853   97374985   133
YourSeq  108    1363   1514  3000   84.8%     chrX   -       144332551  144332676  126
YourSeq  108    1383   1512  3000   91.2%     chr12  -       75479163   75479290   128
YourSeq  106    1402   1734  3000   83.4%     chr17  +       88660820   88661080   261
YourSeq  105    1388   1511  3000   89.2%     chr1   -       13041758   13041877   120
YourSeq  105    1393   1539  3000   92.8%     chrX   +       53827831   53827997   167
YourSeq  105    1398   1540  3000   92.0%     chr3   +       82737382   82737526   145
YourSeq  104    1363   1511  3000   84.6%     chr11  +       89004433   89004560   128
YourSeq  102    1388   1512  3000   88.3%     chr4   -       48481896   48482015   120
YourSeq  102    1387   1511  3000   88.7%     chr9   +       79817586   79817708   123
YourSeq  101    1399   1534  3000   92.5%     chr8   +       3257692    3257832    141

Note: The 3000 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr3   -       30933035   30936034   3000
YourSeq  38     1111   1157  3000   86.4%     chr7   -       123271022  123271066  45
YourSeq  34     1080   1125  3000   85.8%     chr12  -       99837673   99837717   45
YourSeq  26     1088   1138  3000   76.5%     chr11  +       116470600  116470651  52
YourSeq  25     1511   1537  3000   96.3%     chr12  -       89045711   89045737   27
YourSeq  20     2833   2852  3000   100.0%    chr8   -       109670283  109670302  20
YourSeq  20     1102   1129  3000   85.8%     chr13  +       44685300   44685327   28

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
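The two full-length chr3 hits (score 3000, 100.0% identity) appear to be the query sequences matching their own genomic locations. As a consistency check, the interval between them can be compared with the stated size of the effective cKO region. The sketch below assumes the listed coordinates are 1-based and inclusive, which agrees with the SPAN column (end - start + 1 = 3000). The variable names are illustrative.

```python
# chr3 coordinates of the two full-length BLAT hits (minus strand), as listed above.
# Assumption: coordinates are 1-based and inclusive, consistent with SPAN = 3000.
upstream_hit = (30937407, 30940406)    # 3000 bp section upstream of exon 8
downstream_hit = (30933035, 30936034)  # 3000 bp section downstream of exon 8


def span(start, end):
    """Length of an inclusive interval."""
    return end - start + 1


# The gene is on the reverse strand, so the "upstream" section has the higher
# genomic coordinates; the region between the two hits lies in between.
gap_start = downstream_hit[1] + 1
gap_end = upstream_hit[0] - 1
gap_length = span(gap_start, gap_end)

print(f"upstream span: {span(*upstream_hit)} bp")
print(f"downstream span: {span(*downstream_hit)} bp")
print(f"region between hits: chr3:{gap_start}-{gap_end} ({gap_length} bp)")
```

The interval between the two hits is 1372 bp, which matches the effective cKO region size given in the strategy summary.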


Gene and information: Phc3 polyhomeotic 3 [Mus musculus (house mouse)], Gene ID: 241915, updated on 10-Oct-2019

Gene summary

Official Symbol: Phc3 (provided by MGI)
Official Full Name: polyhomeotic 3 (provided by MGI)
Primary source: MGI:MGI:2181434
See related: Ensembl:ENSMUSG00000037652
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Edr3; Hph3; E030046K01Rik
Expression: Ubiquitous expression in thymus adult (RPKM 7.9), genital fat pad adult (RPKM 7.0) and 28 other tissues
Orthologs: human

Genomic context

Location: 3; 3 A3

Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  3    NC_000069.6 (30899295..30969479, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    3    NC_000069.5 (30798217..30868337, complement)

[Figure: genomic context, Chromosome 3 - NC_000069.6]


Transcript information: This gene has 10 transcripts

Gene: Phc3 ENSMUSG00000037652

Description: polyhomeotic 3 [Source:MGI Symbol;Acc:MGI:2181434]
Gene Synonyms: E030046K01Rik, EDR3, HPH3
Location: Chromosome 3: 30,899,371-30,969,415 reverse strand. GRCm38:CM000996.2
About this gene: This gene has 10 transcripts (splice variants), 239 orthologues, 16 paralogues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name      Transcript ID          bp     Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Phc3-209  ENSMUST00000168645.7   10976  981aa       ENSMUSP00000130142.1  Protein coding           CCDS17288  Q8CHP6   TSL:1 GENCODE basic APPRIS P3
Phc3-204  ENSMUST00000108255.7   3468   948aa       ENSMUSP00000103890.1  Protein coding           CCDS50884  D3YY34   TSL:1 GENCODE basic APPRIS ALT2
Phc3-206  ENSMUST00000129817.8   3019   981aa       ENSMUSP00000114916.2  Protein coding           CCDS17288  Q8CHP6   TSL:1 GENCODE basic APPRIS P3
Phc3-210  ENSMUST00000177992.7   2920   948aa       ENSMUSP00000136820.1  Protein coding           CCDS50884  D3YY34   TSL:5 GENCODE basic APPRIS ALT2
Phc3-202  ENSMUST00000064718.11  2893   951aa       ENSMUSP00000065617.5  Protein coding           CCDS50885  B7ZNA5   TSL:1 GENCODE basic APPRIS ALT2
Phc3-203  ENSMUST00000099163.4   2808   715aa       ENSMUSP00000096767.3  Protein coding           -          E9QPT4   TSL:1 GENCODE basic
Phc3-201  ENSMUST00000046624.11  3491   595aa       ENSMUSP00000037862.5  Nonsense mediated decay  -          F8WIN6   TSL:5
Phc3-208  ENSMUST00000152357.7   3155   232aa       ENSMUSP00000117614.1  Nonsense mediated decay  -          D6REI6   TSL:1
Phc3-207  ENSMUST00000150939.2   3660   No protein  -                     Retained intron          -          -        TSL:3
Phc3-205  ENSMUST00000124472.1   2958   No protein  -                     Retained intron          -          -        TSL:1


[Figure: Ensembl region view, 90.05 kb, axis ticks from 30.90 Mb to 30.96 Mb.
Forward strand: Gpr160-201, -202, -203, -204, -206, -207 and -208 (protein coding); Gm2979-201 (processed pseudogene).
Contigs: AC111093.10.
Reverse strand: Phc3-209, -204, -206, -210, -202 and -203 (protein coding); Phc3-201 and -208 (nonsense mediated decay); Phc3-205 and -207 (retained intron); 9530022L04Rik-201 (TEC).
Tracks: Genes (Comprehensive set...); Regulatory Build.
Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (pseudogene; processed transcript).]


Transcript: ENSMUST00000168645

[Figure: protein domain view of Phc3-209 (protein coding), reverse strand, 70.03 kb; protein ENSMUSP00000130...; scale bar 0 to 981.
Annotation tracks:
MobiDB lite; Low complexity (Seg)
Superfamily: Sterile alpha motif/pointed domain superfamily
SMART: Sterile alpha motif domain
Pfam: Sterile alpha motif domain
PROSITE profiles: Zinc finger, FCS-type; Sterile alpha motif domain
PANTHER: PTHR12247:SF88; PTHR12247
Gene3D: Sterile alpha motif/pointed domain superfamily; FCS-type zinc finger superfamily
CDD: cd09577
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: splice acceptor variant; inframe deletion; missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
