https://www.alphaknockout.com

Mouse Pcdh11x Knockout Project (CRISPR/Cas9)

Objective: To create a Pcdh11x knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pcdh11x gene (NCBI Reference Sequence: NM_001271810; Ensembl: ENSMUSG00000034755) is located on mouse chromosome X. Six exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 6 (Transcript: ENSMUST00000113358). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from the coding region. Exon 2 covers 13.64% of the coding region. The size of the effective KO region is ~584 bp. The KO region does not contain any other known gene.
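As a rough consistency check on the figures in the note, the arithmetic can be sketched in Python. The 1320 aa protein length is taken from the transcript table for ENSMUST00000113358, and the chrX coordinates from the BLAT self-matches reported later in this document. Treating the two 2000 bp flanks as directly abutting exon 2 is an assumption, and all variable names are illustrative.

```python
# Consistency check for the exon 2 figures quoted in the note.
# Assumption: the 2000 bp flanks used for BLAT directly abut exon 2.

PROTEIN_AA = 1320             # protein length of ENSMUST00000113358
UP_FLANK_END = 120364830      # last base of the upstream flank (chrX)
DOWN_FLANK_START = 120365371  # first base of the downstream flank (chrX)
KO_REGION_BP = 584            # "effective KO region" from the note

cds_bp = PROTEIN_AA * 3       # coding bases, stop codon excluded
exon2_bp = DOWN_FLANK_START - UP_FLANK_END - 1  # bases between the flanks
exon2_pct = round(100 * exon2_bp / cds_bp, 2)
extra_bp = KO_REGION_BP - exon2_bp  # KO bases beyond exon 2 itself

print(f"exon 2: {exon2_bp} bp = {exon2_pct}% of a {cds_bp} bp CDS")
print(f"KO region extends {extra_bp} bp beyond exon 2")
```

Under these assumptions exon 2 comes out at 540 bp, which agrees with the quoted 13.64%. The ~584 bp KO region would then include roughly 44 bp outside the exon.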


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Pcdh11x with exons 1, 2 and 6 labeled and the 5' and 3' gRNA regions marked. Legend: exon of mouse Pcdh11x; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the 2000 bp section upstream of Exon 2. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
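The self-alignment idea can be sketched as follows. This is a minimal exact-match version, assuming a word size equal to the 15 bp window; the report does not name the dot plot tool or its scoring, so the function names and the exact-match rule are illustrative only.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string (non-ACGT characters pass through)."""
    return seq.translate(_COMPLEMENT)[::-1]

def self_dot_plot(seq, window=15):
    """Return (forward, reverse) lists of (i, j) window matches of seq against itself.

    forward: windows at i < j with identical sequence (off the main diagonal).
    reverse: window at i whose reverse complement occurs at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last = len(seq) - window + 1
    for i in range(last):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse

# Toy example: a 7 bp unit repeated three times gives off-diagonal forward dots.
fwd, rev = self_dot_plot("GATTACA" * 3, window=5)
print(fwd[:3])
```

A tandem repeat shows up as runs of forward dots parallel to the main diagonal, offset by the repeat period. A sequence with no repeated word of the chosen length yields an empty forward list.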

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the 2000 bp section downstream of Exon 2. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution across the 2000 bp section upstream of Exon 2.]

Summary: Full length (2000 bp) | A: 29.35% (587) | C: 17.20% (344) | T: 39.50% (790) | G: 13.95% (279)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution across the 2000 bp section downstream of Exon 2.]

Summary: Full length (2000 bp) | A: 33.40% (668) | C: 14.30% (286) | T: 37.40% (748) | G: 14.90% (298)

Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
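The overall GC content implied by the two Summary lines, and a windowed profile of the kind plotted above, can be sketched as below. The base counts are copied from the summaries. The step size of the sliding window is not stated in the report, so `step` is an assumed parameter, and the function names are illustrative.

```python
def gc_percent(seq):
    """GC content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def gc_windows(seq, window=300, step=1):
    """GC percentage for each window of `window` bp, moving by `step` bp (step is assumed)."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]

# Base counts copied from the two Summary lines (2000 bp each).
COUNTS = {
    "up":   {"A": 587, "C": 344, "T": 790, "G": 279},
    "down": {"A": 668, "C": 286, "T": 748, "G": 298},
}

overall_gc = {
    name: round(100.0 * (c["G"] + c["C"]) / sum(c.values()), 2)
    for name, c in COUNTS.items()
}
print(overall_gc)
```

From the quoted counts, the overall GC content works out to about 31.15% upstream and 29.2% downstream, which is consistent with the notes reporting no high GC-content region.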


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chrX   +       120362831  120364830  2000
YourSeq  90     52      597   2000   88.7%     chr14  +       53142773   53143327   555
YourSeq  84     357     648   2000   80.9%     chr14  -       72321350   72321636   287
YourSeq  84     368     646   2000   76.3%     chrX   +       96301570   96301816   247
YourSeq  83     332     632   2000   79.4%     chr1   -       22326641   22326932   292
YourSeq  77     280     632   2000   80.5%     chr1   -       45214257   45214588   332
YourSeq  76     259     598   2000   75.0%     chr10  +       55336641   55336954   314
YourSeq  75     260     628   2000   81.2%     chrX   -       73245326   73245687   362
YourSeq  75     373     632   2000   69.8%     chrX   +       112108030  112108276  247
YourSeq  74     519     647   2000   87.9%     chr12  +       35953769   35953899   131
YourSeq  73     232     595   2000   76.7%     chr10  +       18933854   18934205   352
YourSeq  70     326     625   2000   74.7%     chr12  +       59478980   59479279   300
YourSeq  68     454     632   2000   84.4%     chr19  -       48273156   48273330   175
YourSeq  66     371     648   2000   78.2%     chr12  +       70514346   70514616   271
YourSeq  65     518     673   2000   81.6%     chrX   -       132231473  132231631  159
YourSeq  64     518     648   2000   76.3%     chrX   -       6311387    6311520    134
YourSeq  64     355     634   2000   78.8%     chr18  -       64688245   64688507   263
YourSeq  57     44      276   2000   87.2%     chr10  -       23142466   23142707   242
YourSeq  55     561     634   2000   87.9%     chr12  -       63945057   63945192   136
YourSeq  55     572     648   2000   85.8%     chr7   +       137542861  137542937  77

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. Apart from the expected full-length match to its own location on chrX, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chrX   +       120365371  120367370  2000
YourSeq  34     1092    1135  2000   94.8%     chr6   -       9411702    9411755    54
YourSeq  34     1865    1917  2000   87.0%     chr2   +       14343520   14343573   54
YourSeq  32     1112    1145  2000   97.1%     chr16  -       96871468   96871501   34
YourSeq  32     1102    1142  2000   97.3%     chr16  +       4409549    4409591    43
YourSeq  26     716     744   2000   85.2%     chr11  +       55198286   55198312   27
YourSeq  25     543     568   2000   100.0%    chr2   +       105763556  105763583  28
YourSeq  23     1117    1139  2000   100.0%    chr13  -       104451112  104451134  23

Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. Apart from the expected full-length match to its own location on chrX, no significant similarity is found.
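One way to read tables like these programmatically is sketched below, using the first four rows of the upstream results. The report does not state what counts as "significant", so the coverage and identity cutoffs here are hypothetical, as are the function and field names.

```python
# First four rows of the upstream BLAT table (QUERY column omitted).
ROWS = """\
2000 1 2000 2000 100.0% chrX + 120362831 120364830 2000
90 52 597 2000 88.7% chr14 + 53142773 53143327 555
84 357 648 2000 80.9% chr14 - 72321350 72321636 287
84 368 646 2000 76.3% chrX + 96301570 96301816 247
"""

def parse_hits(rows):
    """Turn whitespace-separated BLAT rows into a list of dicts."""
    hits = []
    for line in rows.splitlines():
        score, qs, qe, qsize, ident, chrom, strand, ts, te, span = line.split()
        hits.append({
            "score": int(score),
            "q_cov": (int(qe) - int(qs) + 1) / int(qsize),  # fraction of query aligned
            "identity": float(ident.rstrip("%")),
            "chrom": chrom,
            "strand": strand,
            "t_start": int(ts),
            "t_end": int(te),
        })
    return hits

def significant_off_targets(hits, min_cov=0.5, min_identity=90.0):
    """Hits other than the top-scoring self-match that pass both (hypothetical) cutoffs."""
    self_hit = max(hits, key=lambda h: h["score"])
    return [h for h in hits
            if h is not self_hit
            and h["q_cov"] >= min_cov
            and h["identity"] >= min_identity]

hits = parse_hits(ROWS)
print(significant_off_targets(hits))
```

With these illustrative cutoffs, the best non-self hit (score 90 on chr14) covers only about 27% of the query at 88.7% identity, so nothing is flagged.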


Gene and information:

Pcdh11x protocadherin 11 X-linked [Mus musculus (house mouse)]
Gene ID: 245578, updated on 1-Oct-2019

Gene summary

Official Symbol: Pcdh11x (provided by MGI)
Official Full Name: protocadherin 11 X-linked (provided by MGI)
Primary source: MGI:MGI:2442849
See related: Ensembl:ENSMUSG00000034755
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PCDHX; Pcdh11; PCDHX11
Summary: This gene encodes a member of the protocadherin family, and cadherin superfamily, of transmembrane proteins containing cadherin domains. The encoded protein may mediate cell-cell adhesion in neuronal tissues in the presence of calcium. Alternatively spliced transcript variants have been observed for this gene. [provided by RefSeq, Nov 2012]
Expression: Biased expression in CNS E18 (RPKM 1.6), whole brain E14.5 (RPKM 0.8) and 6 other tissues
Orthologs: human, all

Genomic context

Location: X; X E2
Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (120290238..120910622)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (117403936..118020059)

Chromosome X - NC_000086.7


Transcript information: This gene has 9 transcripts

Gene: Pcdh11x ENSMUSG00000034755

Description: protocadherin 11 X-linked [Source:MGI Symbol;Acc:MGI:2442849]
Gene Synonyms: A230092L07Rik, PCDHX
Location: Chromosome X: 120,290,259-120,910,619 forward strand. GRCm38:CM001013.2
About this gene: This gene has 9 transcripts (splice variants), 188 orthologues, 33 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Pcdh11x-202  ENSMUST00000113358.9   8607  1320aa      ENSMUSP00000108985.3  Protein coding           CCDS72426  B1AZR7      TSL:5, GENCODE basic
Pcdh11x-206  ENSMUST00000192677.5   9097  1338aa      ENSMUSP00000141522.1  Protein coding           -          F6ZNL5      TSL:5, GENCODE basic, APPRIS P5
Pcdh11x-208  ENSMUST00000193899.1   5394  1026aa      ENSMUSP00000141581.1  Protein coding           -          A0A0A6YWK0  TSL:5, GENCODE basic, APPRIS ALT2
Pcdh11x-203  ENSMUST00000113364.9   4454  1338aa      ENSMUSP00000108991.4  Protein coding           -          F6ZNL5      TSL:5, GENCODE basic, APPRIS P5
Pcdh11x-201  ENSMUST00000050239.15  4442  1334aa      ENSMUSP00000052340.9  Protein coding           -          E9Q622      TSL:5, GENCODE basic
Pcdh11x-205  ENSMUST00000191653.1   1491  496aa       ENSMUSP00000141600.1  Protein coding           -          Q2TJH7      CDS 5' incomplete, TSL:1
Pcdh11x-209  ENSMUST00000195088.5   4402  1128aa      ENSMUSP00000142050.1  Nonsense mediated decay  -          A0JNT1      TSL:1
Pcdh11x-204  ENSMUST00000155223.6   1635  304aa       ENSMUSP00000138407.1  Nonsense mediated decay  -          Q2TJH9      CDS 5' incomplete, TSL:1
Pcdh11x-207  ENSMUST00000192977.1   2145  No protein  -                     Retained intron          -          -           TSL:NA


[Figure: Ensembl genomic view of the Pcdh11x locus (640.36 kb). Forward strand: transcripts Pcdh11x-201, -202, -203, -205, -206 and -208 (protein coding), Pcdh11x-204 and -209 (nonsense mediated decay), Pcdh11x-207 (retained intron), and Gm14927-201 (processed pseudogene). Contigs: BX005446.8, AC115298.6, AC117258.2, BX005240.7. Reverse strand: H2afb3-201 (protein coding), Gm14930-201 and Gm14931-201 (processed pseudogenes). Tracks: Regulatory Build. Regulation legend: CTCF, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, pseudogene).]


Transcript: ENSMUST00000113358

[Figure: Ensembl view of Pcdh11x-202 (protein coding; 620.29 kb, forward strand) with protein annotation tracks for ENSMUSP00000108...: MobiDB lite; Low complexity (Seg); Cleavage site (Sign...; Superfamily (Cadherin-like superfamily); SMART (Cadherin-like); Prints (Cadherin-like); Pfam (Cadherin-like; Protocadherin; Cadherin, N-terminal); PROSITE profiles (PS50268); PROSITE patterns (Cadherin conserved site); PANTHER (PTHR24028:SF254; PTHR24028); Gene3D (2.60.40.60); CDD (cd11304). Variant track: sequence variants (dbSNP and all other sources); variant legend: missense variant, synonymous variant. Scale bar: 0 to 1320 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
