https://www.alphaknockout.com

Mouse Pcdhgc3 Knockout Project (CRISPR/Cas9)

Objective: To create a Pcdhgc3 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pcdhgc3 gene (NCBI Reference Sequence: NM_033581; Ensembl: ENSMUSG00000102918) is located on mouse chromosome 18. Four exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 4 (Transcript: ENSMUST00000076807). Exon 1 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from the coding region. Exon 1 covers 86.72% of the coding region. The size of the effective KO region is ~2430 bp. The KO region does not contain any other known gene.
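The coverage figure in the note can be cross-checked with simple arithmetic. The sketch below assumes that the coding region length equals the 934 aa protein length listed for ENSMUST00000076807 multiplied by 3, with the stop codon excluded, and that the ~2430 bp KO region corresponds to the coding bases of exon 1. Both are assumptions of this example, not values stated in the report, and the function name is hypothetical.

```python
def exon_coverage_percent(exon_coding_bp: int, protein_aa: int) -> float:
    """Percent of the coding region covered by an exon's coding bases."""
    # Assumption: coding region = codons * 3, stop codon not counted.
    cds_bp = protein_aa * 3
    return 100.0 * exon_coding_bp / cds_bp


# ~2430 bp effective KO region, 934 aa protein (Pcdhgc3-201).
coverage = exon_coverage_percent(exon_coding_bp=2430, protein_aa=934)
print(f"{coverage:.2f}%")  # prints 86.72%
```

Under these assumptions the result agrees with the 86.72% quoted above.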


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1 and 4 of mouse Pcdhgc3 labeled and a gRNA region on each side of the knockout region. Legend: exon of mouse Pcdhgc3; knockout region.]


Overview of the Dot Plot (up) Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section of Exon 1 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
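A self-alignment dot plot of this kind can be approximated by indexing every 15 bp window of the sequence and reporting pairs of positions whose windows match, either directly or as reverse complements. The following is a minimal sketch under that assumption; it uses exact matching only, the function names and toy sequence are hypothetical, and the tool that produced the plots in this report may score matches differently.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_matches(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: i < j and the windows at i and j are identical.
    reverse: the window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    index: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward = sorted(
        (a, b) for starts in index.values() for a in starts for b in starts if a < b
    )
    reverse = sorted(
        (i, j)
        for i in range(len(seq) - window + 1)
        for j in index.get(reverse_complement(seq[i:i + window]), [])
    )
    return forward, reverse


# Toy input (not from the report): a 20 bp unit repeated in tandem.
unit = "ACGTTGCAAGCTTAGGCTAC"
forward, reverse = self_matches(unit * 2)
print((0, 20) in forward)  # the tandem copy shows up as an off-diagonal match
```

An empty `forward` list for a real 2000 bp query would correspond to the "no significant tandem repeat" conclusion in the note.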

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section of Exon 1 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(22.25% 445) | C(28.6% 572) | T(21.6% 432) | G(27.55% 551)

Note: The 2000 bp section of Exon 1 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
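The overall GC content follows directly from the base counts in the summary lines, and the plotted distribution can be approximated with a sliding 300 bp window. The sketch below illustrates both under the assumption of a plain G+C fraction per window; the function names are hypothetical and the step size used for the original plots is not stated in the report.

```python
def overall_gc_percent(counts: dict[str, int]) -> float:
    """GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc_percent(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts copied from the two summary lines of this report.
up_counts = {"A": 445, "C": 572, "T": 432, "G": 551}
down_counts = {"A": 450, "C": 601, "T": 432, "G": 517}

print(f"{overall_gc_percent(up_counts):.2f}")    # prints 56.15
print(f"{overall_gc_percent(down_counts):.2f}")  # prints 55.90
```

Both 2000 bp sections are therefore close to 56% GC overall; the windowed function is the part that would reveal a local GC-rich stretch if one existed.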

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(2000bp) | A(22.5% 450) | C(30.05% 601) | T(21.6% 432) | G(25.85% 517)

Note: The 2000 bp section of Exon 1 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART    TEND      SPAN
YourSeq  2000   1       2000  2000   100.0%    chr18  +       37806550  37808549  2000
YourSeq  54     1031    1152  2000   72.2%     chr18  +       37686414  37686535  122
YourSeq  53     1236    1583  2000   57.8%     chr18  +       37296201  37402626  106426
YourSeq  34     1321    1583  2000   54.1%     chr18  +       37321876  37322132  257
YourSeq  31     372     406   2000   94.3%     chr14  +       84446462  84446496  35
YourSeq  30     1920    1957  2000   91.5%     chr11  +       48882929  48883031  103
YourSeq  24     1321    1583  2000   52.2%     chr18  +       37413180  37413436  257
YourSeq  22     108     129   2000   100.0%    chr11  +       98316815  98316836  22
YourSeq  22     1082    1106  2000   95.9%     chr10  +       60369381  60369406  26
YourSeq  20     237     256   2000   100.0%    chr1   -       81752819  81752838  20

Note: The 2000 bp section of Exon 1 is BLAT searched against the genome. No significant similarity is found.
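One way to read the BLAT results is to separate the full-length self hit from all other hits and look at the best remaining score. The sketch below does this for the upstream table; the row text is copied from this report, while the parser, the field names, and the rule "highest score is the self hit" are assumptions made for illustration only.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int


# Rows from the upstream BLAT table (SPAN column omitted).
ROWS = """\
2000 1 2000 2000 100.0% chr18 + 37806550 37808549
54 1031 1152 2000 72.2% chr18 + 37686414 37686535
53 1236 1583 2000 57.8% chr18 + 37296201 37402626
34 1321 1583 2000 54.1% chr18 + 37321876 37322132
31 372 406 2000 94.3% chr14 + 84446462 84446496
30 1920 1957 2000 91.5% chr11 + 48882929 48883031
24 1321 1583 2000 52.2% chr18 + 37413180 37413436
22 108 129 2000 100.0% chr11 + 98316815 98316836
22 1082 1106 2000 95.9% chr10 + 60369381 60369406
20 237 256 2000 100.0% chr1 - 81752819 81752838
"""


def parse_hits(text: str) -> list[BlatHit]:
    """Parse whitespace-separated BLAT rows into BlatHit records."""
    hits = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 9:
            continue  # skip blank or malformed lines
        hits.append(BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                            float(f[4].rstrip("%")), f[5], f[6],
                            int(f[7]), int(f[8])))
    return hits


hits = parse_hits(ROWS)
self_hit = max(hits, key=lambda h: h.score)        # assumed: the query's own locus
secondary = [h for h in hits if h is not self_hit]
best_secondary = max(secondary, key=lambda h: h.score)
print(best_secondary.score, best_secondary.identity, best_secondary.chrom)
# prints 54 72.2 chr18
```

The best non-self hit scores 54 against a query of 2000 bp, which is consistent with the note that no significant similarity is found elsewhere in the genome.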

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART    TEND      SPAN
YourSeq  2000   1       2000  2000   100.0%    chr18  +       37806978  37808977  2000
YourSeq  54     603     724   2000   72.2%     chr18  +       37686414  37686535  122
YourSeq  53     808     1155  2000   57.8%     chr18  +       37296201  37402626  106426
YourSeq  34     893     1155  2000   54.1%     chr18  +       37321876  37322132  257
YourSeq  30     1492    1529  2000   91.5%     chr11  +       48882929  48883031  103
YourSeq  24     893     1155  2000   52.2%     chr18  +       37413180  37413436  257
YourSeq  24     1642    1666  2000   100.0%    chr10  +       17306641  17306675  35
YourSeq  22     654     678   2000   95.9%     chr10  +       60369381  60369406  26

Note: The 2000 bp section of Exon 1 is BLAT searched against the genome. No significant similarity is found.


Gene and information: Pcdhgc3 protocadherin gamma subfamily C, 3 [ Mus musculus (house mouse) ] Gene ID: 93706, updated on 11-Sep-2019

Gene summary

Official Symbol: Pcdhgc3 (provided by MGI)
Official Full Name: protocadherin gamma subfamily C, 3 (provided by MGI)
Primary source: MGI:MGI:1935201
See related: Ensembl:ENSMUSG00000102918
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PC43; Pcdh2
Expression: Broad expression in adrenal adult (RPKM 57.2), frontal lobe adult (RPKM 54.5) and 24 other tissues
Orthologs: human; all

Genomic context

Location: 18 B3; 18 19.67 cM
Exon count: 4

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (37806410..37841873)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (37966064..38001527)

Chromosome 18 - NC_000084.6


Transcript information: This gene has 3 transcripts

Gene: Pcdhgc3 ENSMUSG00000102918

Description: protocadherin gamma subfamily C, 3 [Source:MGI Symbol;Acc:MGI:1935201]
Gene Synonyms: PC43, Pcdh2
Location: Chromosome 18: 37,806,364-37,841,873 forward strand. GRCm38:CM001011.2
About this gene: This gene has 3 transcripts (splice variants), 43 orthologues, 69 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Pcdhgc3-201  ENSMUST00000076807.6  4687  934aa       ENSMUSP00000076085.4  Protein coding           CCDS29192  Q91XX1      TSL:1, GENCODE basic, APPRIS P1
Pcdhgc3-202  ENSMUST00000192103.1  3588  78aa        ENSMUSP00000141611.1  Nonsense mediated decay  -          A0A0A6YWM6  TSL:5
Pcdhgc3-203  ENSMUST00000194447.1  3380  No protein  -                     Retained intron          -          -           TSL:NA

[Figure: Ensembl genome browser view of chromosome 18, 37.80 Mb to 37.85 Mb (55.51 kb).
Forward strand: Pcdhga1 to Pcdhga12 (protein coding); Pcdhgb1, Pcdhgb2, Pcdhgb4, Pcdhgb5, Pcdhgb6, Pcdhgb7 and Pcdhgb8 (protein coding; Pcdhgb8-201 is a polymorphic pseudogene); Pcdhgc3-201 (protein coding), Pcdhgc3-202 (nonsense mediated decay) and Pcdhgc3-203 (retained intron); Pcdhgc4-201 and Pcdhgc4-203 (protein coding), Pcdhgc4-202 (retained intron); Pcdhgc5-201, Pcdhgc5-203 and Pcdhgc5-204 (protein coding), Pcdhgc5-202 (retained intron); Gm37013, Gm37388 and Gm42416 (protein coding); Gm38182-201 to Gm38182-205 (lncRNA).
Contigs: AC020971.3, AC129315.3.
Reverse strand: BC037039-201 (lncRNA), Gm29994-201 (lncRNA), Diaph1-201 to Diaph1-205 (protein coding), Diaph1-207 (lncRNA).
Tracks and legends: Regulatory Build (CTCF, Promoter, Promoter Flank); protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, pseudogene, RNA gene).]


Transcript: ENSMUST00000076807

[Figure: protein domain view of Pcdhgc3-201 (protein coding; 35.46 kb, forward strand), protein ENSMUSP00000076..., scale 0 to 934 aa.]

Annotation tracks: Transmembrane heli..., PDB-ENSP mappings, MobiDB lite, Low complexity (Seg), Cleavage site (Sign...
Superfamily: Cadherin-like superfamily
SMART: Cadherin-like
Prints: Cadherin-like
Pfam: Cadherin-like; Cadherin, C-terminal catenin-binding domain; Cadherin, N-terminal; Cadherin, cytoplasmic C-terminal domain
PROSITE profiles: PS50268
PROSITE patterns: Cadherin conserved site
PANTHER: Protocadherin gamma-C3; PTHR24028
Gene3D: 2.60.40.60
CDD: cd11304
Sequence variants (dbSNP and all other sources): missense variant, synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
