https://www.alphaknockout.com

Mouse Pcdhac2 Knockout Project (CRISPR/Cas9)

Objective: To create a Pcdhac2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pcdhac2 gene (NCBI Reference Sequence: NM_001003672; Ensembl: ENSMUSG00000102697) is located on mouse chromosome 18. Four exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000047479). Exon 1 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a reporter allele are viable and fertile with no apparent gross phenotype.

Exon 1 starts from the coding region. Exon 1 covers 84.89% of the coding region. The size of the effective KO region is ~2562 bp. The KO region does not contain any other known gene.
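As a rough consistency check on the two figures above, the coverage percentage can be recomputed from the KO region size. This sketch assumes that the coding region length equals the protein length reported in the transcript table (1006 aa) times 3, with the stop codon excluded; that assumption is not stated in the report, and the variable names are illustrative only.

```python
# Assumption (not stated in the report): coding length = protein length x 3,
# stop codon excluded.
PROTEIN_AA = 1006      # protein length of Pcdhac2-201 from the transcript table
KO_REGION_BP = 2562    # effective KO region size stated in the report

cds_bp = PROTEIN_AA * 3
fraction = KO_REGION_BP / cds_bp

print(f"coding region: {cds_bp} bp, KO region covers {fraction:.2%}")
# coding region: 3018 bp, KO region covers 84.89%
```

Under this assumption the ~2562 bp KO region corresponds to the 84.89% of the coding region carried by exon 1.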


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with two gRNA regions marked and exons 1 and 4 labeled. Legend: exon of mouse Pcdhac2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence aligned with itself, showing forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The 2000 bp section of Exon 1 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
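The self-alignment described in the note can be sketched as follows. This is a minimal illustration, not the tool used to produce the plot: it assumes exact matching of fixed-size windows (15 bp by default, as in the plot), and the function names `reverse_complement` and `self_dot_plot` are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) dots for exact window matches.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (off the main diagonal)
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    # Index every window by its start position.
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, idxs in positions.items():
        # Same-strand repeats: every pair of positions sharing this window.
        forward.extend((i, j) for n, i in enumerate(idxs) for j in idxs[n + 1:])
        # Inverted repeats: positions where the reverse complement occurs.
        rc_hits = positions.get(reverse_complement(kmer), [])
        reverse.extend((i, j) for i in idxs for j in rc_hits)
    return sorted(forward), sorted(reverse)


# Toy example: a 7 bp unit repeated twice gives one off-diagonal forward dot.
print(self_dot_plot("GATTACAGATTACA", window=7))  # ([(0, 7)], [])
```

A run of forward dots along a diagonal parallel to the main diagonal would indicate a tandem repeat; an empty or sparse result corresponds to the "no significant tandem repeat" reading of the plot.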

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence aligned with itself, showing forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The 2000 bp section of Exon 1 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(22.95% 459) | C(27.7% 554) | T(21.3% 426) | G(28.05% 561)

Note: The 2000 bp section of Exon 1 is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
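A sliding-window GC calculation of the kind described in the note can be sketched as below. The window size of 300 bp comes from the plot heading; the step size, the function names `gc_fraction` and `windowed_gc`, and the idea of reading high-GC regions off the per-window values are assumptions for illustration, since the report does not state how the plot was computed. The last lines recompute the overall GC content from the base counts in the summary line for the (up) section.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each full window of the given size along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy example with a 2 bp window.
print(windowed_gc("GGAA", window=2))  # [1.0, 0.5, 0.0]

# Overall GC content from the base counts in the (up) summary line.
counts = {"A": 459, "C": 554, "T": 426, "G": 561}
overall = (counts["G"] + counts["C"]) / sum(counts.values())
print(f"{overall:.2%}")  # 55.75%
```

Applying `windowed_gc` to the actual 2000 bp sequence (not included in this report) would give the per-window values that a plot like the one above displays.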

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(25.5% 510) | C(25.95% 519) | T(22.7% 454) | G(25.85% 517)

Note: The 2000 bp section of Exon 1 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       37143971   37145970   2000
YourSeq  141    1044   1852  2000   68.0%     chr18  +       37422622   37738931   316310
YourSeq  74     1301   1615  2000   64.6%     chr18  +       37695755   37696069   315
YourSeq  43     1796   2000  2000   60.5%     chr18  +       37402806   37403010   205
YourSeq  41     1794   1995  2000   61.3%     chr18  +       37335769   37335970   202
YourSeq  41     1794   1854  2000   83.7%     chr18  +       37423360   37423420   61
YourSeq  39     1794   2000  2000   59.5%     chr18  +       37322310   37322516   207
YourSeq  32     54     104   2000   82.4%     chr11  +       102202543  102202601  59
YourSeq  26     45     77    2000   96.6%     chr12  -       76572445   76572482   38
YourSeq  26     64     104   2000   71.5%     chr10  -       95053956   95053987   32
YourSeq  25     64     103   2000   66.7%     chr1   +       88503283   88503310   28
YourSeq  25     733    764   2000   88.9%     chr1   +       74134542   74134572   31
YourSeq  24     410    441   2000   87.5%     chr3   +       45379589   45379620   32
YourSeq  24     1064   1087  2000   100.0%    chr1   +       104975041  104975064  24
YourSeq  23     88     112   2000   96.0%     chr12  -       109432717  109432741  25
YourSeq  22     1783   1804  2000   100.0%    chr4   -       57224030   57224051   22
YourSeq  22     1913   1935  2000   100.0%    chr1   +       87577434   87577459   26
YourSeq  20     1580   1599  2000   100.0%    chr1   +       195262207  195262226  20

Note: The 2000 bp section of Exon 1 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr18  +       37144531   37146530   2000
YourSeq  141    484    1292  2000   68.0%     chr18  +       37422622   37738931   316310
YourSeq  74     741    1055  2000   64.6%     chr18  +       37695755   37696069   315
YourSeq  45     1234   1443  2000   59.7%     chr18  +       37335769   37335975   207
YourSeq  43     1236   1438  2000   60.6%     chr18  +       37402806   37403008   203
YourSeq  41     1234   1294  2000   83.7%     chr18  +       37423360   37423420   61
YourSeq  39     1234   1438  2000   59.6%     chr18  +       37322310   37322514   205
YourSeq  24     504    527   2000   100.0%    chr1   +       104975041  104975064  24
YourSeq  22     1223   1244  2000   100.0%    chr4   -       57224030   57224051   22
YourSeq  22     1353   1375  2000   100.0%    chr1   +       87577434   87577459   26
YourSeq  20     1020   1039  2000   100.0%    chr1   +       195262207  195262226  20

Note: The 2000 bp section of Exon 1 is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Pcdhac2 protocadherin alpha subfamily C, 2 [Mus musculus (house mouse)]
Gene ID: 353237, updated on 11-Sep-2019

Gene summary

Official Symbol: Pcdhac2 (provided by MGI)
Official Full Name: protocadherin alpha subfamily C, 2 (provided by MGI)
Primary source: MGI:MGI:1891443
See related: Ensembl:ENSMUSG00000102697
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CNRc2
Expression: Biased expression in CNS E18 (RPKM 21.2), cortex adult (RPKM 14.2) and 8 other tissues
Orthologs: human

Genomic context

Location: 18; 18 B3
Exon count: 4

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (37143678..37187663)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (37303623..37347311)

[Figure: genomic context of Pcdhac2 on chromosome 18 (NC_000084.6).]


Transcript information: This gene has 1 transcript

Gene: Pcdhac2 ENSMUSG00000102697

Description: protocadherin alpha subfamily C, 2 [Source:MGI Symbol;Acc:MGI:1891443]
Gene Synonyms: CNRc2
Location: Chromosome 18: 37,143,503-37,187,657 forward strand. GRCm38:CM001011.2
About this gene: This gene has 1 transcript (splice variant), 220 orthologues, 69 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name         Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Pcdhac2-201  ENSMUST00000047479.2  5888  1006aa   ENSMUSP00000039888.1  Protein coding  CCDS29168  Q91Y09   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl region view of chromosome 18, 37.14 Mb to 37.19 Mb (64.16 kb). Forward strand transcripts shown: Pcdhac2-201 (protein coding) together with Pcdha1-201, Pcdha1-202, Pcdha2-201, Pcdha2-202, Pcdha3-201, Pcdha4-201, Pcdha4-202, Pcdha5-201, Pcdha6-201, Pcdha6-202, Pcdha6-203 (nonsense mediated decay), Pcdha7-201, Pcdha8-201, Pcdha9-201, Pcdha11-201, Pcdha11-202, Pcdha11-203, Pcdha11-204, Pcdha11-206 (retained intron), Pcdha12-201, Pcdhac1-201, Gm37388-201, Gm42416-201 and Gm37013-201 (protein coding unless noted). Contigs: AC020972.3, AC020973.3. Reverse strand transcripts: Gm10544-201 (lncRNA), Gm10544-202 (lncRNA), Gm10544-203 (retained intron), Gm38097-201 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000047479

[Figure: Ensembl protein view of Pcdhac2-201 (protein coding; 44.16 kb, forward strand), translation ENSMUSP00000039..., scale bar 0 to 1006. Tracks and annotations, in the order shown:
Transmembrane heli...
PDB-ENSP mappings
MobiDB lite
Low complexity (Seg)
Coiled-coils (Ncoils)
Cleavage site (Sign...
Superfamily: Cadherin-like superfamily
SMART: Cadherin-like
Prints: Cadherin-like
Pfam: Cadherin-like; Cadherin, cytoplasmic C-terminal domain; Cadherin, N-terminal; Cadherin, C-terminal catenin-binding domain
PROSITE profiles: PS50268
PROSITE patterns: Cadherin conserved site
PANTHER: PTHR24028; Protocadherin alpha-C2
Gene3D: 2.60.40.60
CDD: cd11304
Sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
