http://www.alphaknockout.com/ Mouse Pcbp2 Knockout Project (CRISPR/Cas9)

Objective: To create a Pcbp2 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Pcbp2 gene (NCBI Reference Sequence: NM_001103165; Ensembl: ENSMUSG00000056851) is located on mouse chromosome 15. Fifteen exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 15 (Transcript: ENSMUST00000077037). Exons 8~12 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice heterozygous for a knock-out allele exhibit decreased body weight, impaired erythroblast maturation and lowered mean platelet counts. Mice homozygous for this allele die between E12.5 and E15.5 with hemorrhage and edema.

Exon 8 starts at about 46.59% of the coding region. Exons 8~12 cover 29.65% of the coding region. The size of the effective KO region is ~4649 bp.
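As a rough plausibility check, the percentages above can be converted into approximate nucleotide counts. The sketch below assumes that the coding region is 362 codons x 3 nt = 1086 nt, with the stop codon excluded. That convention is an assumption, not something the report states. The 362 aa length is the protein length listed for ENSMUST00000077037 in the transcript table. The helper name `fraction_to_nt` is illustrative only.

```python
# Back-of-the-envelope check of the percentages quoted above.
# ASSUMPTION: coding region = 362 codons x 3 nt, stop codon excluded.
PROTEIN_LENGTH_AA = 362  # protein length listed for ENSMUST00000077037
CDS_LENGTH_NT = PROTEIN_LENGTH_AA * 3


def fraction_to_nt(percent: float, cds_length: int = CDS_LENGTH_NT) -> int:
    """Convert a percentage of the coding region to a rounded nucleotide count."""
    return round(percent / 100 * cds_length)


# Estimated coding nucleotides upstream of exon 8 (46.59% of the coding region).
coding_nt_before_exon8 = fraction_to_nt(46.59)

# Estimated coding nucleotides removed with exons 8~12 (29.65% of the coding region).
coding_nt_deleted = fraction_to_nt(29.65)

# A deleted coding length that is not a multiple of 3 would shift the reading frame.
causes_frameshift = coding_nt_deleted % 3 != 0

print(coding_nt_before_exon8, coding_nt_deleted, causes_frameshift)
```

Under this assumption the estimate is about 506 coding nucleotides before exon 8 and about 322 deleted. Since 322 is not a multiple of 3, the deletion would be expected to shift the downstream reading frame. These are back-calculated estimates and should be confirmed against the annotated exon coordinates.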

The mouse gene Gm27406 will also be affected by the deletion of this KO region.

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons 1, 8, 9, 10, 11, 12 and 15 labelled and two gRNA regions marked. Legend: exon of mouse Pcbp2; knockout region.]

Overview of the Dot Plot (up) Window size: 15 bp

[Figure: dot plot of Sequence 12 aligned against itself. Legend: forward, reverse complement.]

Note: The 877 bp section upstream of exon 8 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
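The self-alignment described in the note can be illustrated with a minimal sketch that compares every 15 bp window of a sequence with every other window, on the forward strand and as a reverse complement. This is not the tool used to produce the report's dot plots. The function names are hypothetical, and exact window matching is an assumption (real dot plot tools may tolerate mismatches).

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) window matches of seq against itself.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (main diagonal excluded).
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window].
    """
    seq = seq.upper()
    # Index the start positions of every window-sized substring.
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, starts in positions.items():
        forward.extend((i, j) for i in starts for j in starts if i < j)
        rc = reverse_complement(kmer)
        if rc in positions:  # membership test does not add keys to the defaultdict
            reverse.extend((i, j) for i in starts for j in positions[rc])
    return sorted(forward), sorted(reverse)


# Toy input: a 15 bp unit repeated in tandem shows up as an off-diagonal forward match.
unit = "ACGATTGCAAGGCTA"
tandem_forward, _ = self_dot_plot(unit * 2, window=15)
print(tandem_forward)
```

A flank without tandem repeats would yield few or no off-diagonal forward matches, which corresponds to the empty dot plot matrix the note describes.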

Overview of the Dot Plot (down) Window size: 15 bp

[Figure: dot plot of Sequence 12 aligned against itself. Legend: forward, reverse complement.]

Note: The 937 bp section downstream of exon 12 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (up) Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(877bp) | A(30.79% 270) | C(14.82% 130) | G(19.73% 173) | T(34.66% 304)

Note: The 877 bp section upstream of exon 8 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
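The GC analysis above can be sketched as a sliding-window calculation. The snippet below is illustrative only: the function names are hypothetical, the step size is an assumption (the report gives only the 300 bp window size), and the flank sequence itself is not reproduced here. The overall GC content is recomputed from the base counts in the upstream summary line.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Yield (start, GC fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Overall GC content implied by the base counts in the upstream summary line.
upstream_counts = {"A": 270, "C": 130, "G": 173, "T": 304}
total = sum(upstream_counts.values())
overall_gc = (upstream_counts["C"] + upstream_counts["G"]) / total
print(f"{total} bp, GC = {overall_gc:.2%}")
```

The counts sum to 877 bp and give an overall GC content of about 34.55%, in line with the note that no high-GC region is present. Given the actual flank sequence, `sliding_gc(flank, window=300)` would return the per-window values of the kind plotted in the figure.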

Overview of the GC Content Distribution (down) Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(937bp) | A(29.24% 274) | C(17.18% 161) | G(20.6% 193) | T(32.98% 309)

Note: The 937 bp section downstream of exon 12 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  877    1      877  877    100.0%    chr15  +       102483290  102484166  877
YourSeq  32     603    639  877    97.3%     chr9   +       111345874  111346018  145
YourSeq  31     529    620  877    97.0%     chr1   -       192816555  192816666  112
YourSeq  24     754    786  877    80.8%     chr1   +       62437534   62437563   30
YourSeq  23     655    677  877    100.0%    chr2   -       93574824   93574846   23
YourSeq  23     595    617  877    100.0%    chr1   -       117117290  117117312  23
YourSeq  22     596    617  877    100.0%    chr10  -       42699736   42699757   22
YourSeq  20     13     32   877    100.0%    chr1   -       162524430  162524449  20
YourSeq  20     627    646  877    100.0%    chr10  +       91247885   91247904   20
YourSeq  20     122    141  877    100.0%    chr1   +       52218156   52218175   20

Note: The 877 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  937    1      937  937    100.0%    chr15  +       102488816  102489752  937
YourSeq  27     114    156  937    90.0%     chr9   +       23118321   23118362   42
YourSeq  27     707    734  937    100.0%    chr1   +       138329048  138329471  424
YourSeq  26     177    205  937    96.5%     chr1   -       5849243    5849566    324
YourSeq  25     444    471  937    96.5%     chr15  -       56140765   56140794   30
YourSeq  25     20     46   937    100.0%    chr10  -       65578416   65578447   32
YourSeq  22     887    911  937    95.9%     chr10  -       11292427   11292452   26
YourSeq  22     870    895  937    92.4%     chr6   +       132862516  132862541  26
YourSeq  22     780    801  937    100.0%    chr18  +       43034121   43034142   22
YourSeq  22     333    354  937    100.0%    chr11  +       46331197   46331218   22
YourSeq  21     653    673  937    100.0%    chrX   -       166183606  166183626  21
YourSeq  21     569    589  937    100.0%    chr4   -       27471611   27471631   21
YourSeq  21     886    906  937    100.0%    chr12  -       10681227   10681247   21
YourSeq  21     333    353  937    100.0%    chr1   -       138268281  138268301  21
YourSeq  21     861    881  937    100.0%    chr1   +       107007796  107007816  21
YourSeq  20     913    932  937    100.0%    chr1   +       89786142   89786161   20

Note: The 937 bp section downstream of Exon 12 is BLAT searched against the genome. No significant similarity is found.
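The top-scoring BLAT hits for the two flanking sections also allow a quick cross-check of the ~4649 bp effective KO region quoted in the project summary. The sketch below assumes that the two flanks directly abut the KO region and that the coordinates are 1-based and inclusive. The second assumption is consistent with the SPAN column. The constant and function names are illustrative.

```python
# Top-scoring BLAT hits for the two flanking sections (chr15, + strand),
# copied from the two BLAT result tables above.
UPSTREAM_FLANK = (102483290, 102484166)    # 877 bp section upstream of exon 8
DOWNSTREAM_FLANK = (102488816, 102489752)  # 937 bp section downstream of exon 12


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def gap_between(left: tuple, right: tuple) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


# ASSUMPTION: the KO region is exactly the sequence between the two flanks.
ko_region_size = gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK)

print(span(*UPSTREAM_FLANK), span(*DOWNSTREAM_FLANK), ko_region_size)
```

The spans reproduce the 877 bp and 937 bp flank lengths, and the gap between the flanks comes to 4649 bp, matching the stated size of the effective KO region.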

Gene and information:

Pcbp2 poly(rC) binding protein 2 [Mus musculus (house mouse)]

Gene ID: 18521, updated on 24-Oct-2019

Gene summary

Official Symbol: Pcbp2 (provided by MGI)
Official Full Name: poly(rC) binding protein 2 (provided by MGI)
Primary source: MGI:MGI:108202
See related: Ensembl:ENSMUSG00000056851
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hnrpx; AW412548; alphaCP-2
Expression: Ubiquitous expression in testis adult (RPKM 95.3), limb E14.5 (RPKM 65.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 15 F3; 15 57.61 cM

Exon count: 18

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 15 | NC_000081.6 (102470450..102500059)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 15 | NC_000081.5 (102301063..102330490)

Chromosome 15 - NC_000081.6

Transcript information: This gene has 31 transcripts

Gene: Pcbp2 ENSMUSG00000056851

Description: poly(rC) binding protein 2 [Source:MGI Symbol;Acc:MGI:108202]
Gene Synonyms: Hnrpx, alphaCP-2
Location: Chromosome 15: 102,470,539-102,500,061 forward strand. GRCm38:CM001008.2
About this gene: This gene has 31 transcripts (splice variants), 260 orthologues, 12 paralogues, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Pcbp2-202 | ENSMUST00000078404.14 | 3222 | 362aa | ENSMUSP00000077509.8 | Protein coding | CCDS49741 | Q61990 | TSL:2, GENCODE basic, APPRIS ALT1
Pcbp2-201 | ENSMUST00000077037.12 | 2952 | 362aa | ENSMUSP00000076294.5 | Protein coding | CCDS49741 | Q61990 | TSL:1, GENCODE basic, APPRIS ALT1
Pcbp2-212 | ENSMUST00000229618.1 | 1576 | 331aa | ENSMUSP00000155430.1 | Protein coding | CCDS49742 | Q61990 | GENCODE basic, APPRIS ALT1
Pcbp2-216 | ENSMUST00000229854.1 | 1499 | 349aa | ENSMUSP00000155038.1 | Protein coding | CCDS27884 | Q61990 | GENCODE basic, APPRIS P3
Pcbp2-203 | ENSMUST00000108838.4 | 2571 | 322aa | ENSMUSP00000104466.4 | Protein coding | - | Q3TT81 | TSL:2, GENCODE basic, APPRIS ALT1
Pcbp2-217 | ENSMUST00000229918.1 | 1463 | 335aa | ENSMUSP00000155072.1 | Protein coding | - | A0A2R8VHP9 | GENCODE basic, APPRIS ALT1
Pcbp2-214 | ENSMUST00000229802.1 | 1215 | 335aa | ENSMUSP00000155292.1 | Protein coding | - | A0A2R8VHP9 | GENCODE basic, APPRIS ALT1
Pcbp2-206 | ENSMUST00000229184.1 | 1090 | 316aa | ENSMUSP00000155654.1 | Protein coding | - | A0A2R8VI25 | GENCODE basic
Pcbp2-219 | ENSMUST00000230114.1 | 1086 | 361aa | ENSMUSP00000155789.1 | Protein coding | - | B2M1R7 | GENCODE basic, APPRIS ALT1
Pcbp2-230 | ENSMUST00000231085.1 | 915 | 265aa | ENSMUSP00000155431.1 | Protein coding | - | A0A2R8W6U6 | CDS 5' incomplete
Pcbp2-205 | ENSMUST00000229102.1 | 893 | 224aa | ENSMUSP00000155096.1 | Protein coding | - | A0A2R8W6L5 | CDS 3' incomplete
Pcbp2-227 | ENSMUST00000230728.1 | 873 | 280aa | ENSMUSP00000155631.1 | Protein coding | - | A0A2R8VI12 | CDS 3' incomplete
Pcbp2-231 | ENSMUST00000231089.1 | 843 | 249aa | ENSMUSP00000155548.1 | Protein coding | - | A0A2R8VHY9 | CDS 3' incomplete
Pcbp2-224 | ENSMUST00000230577.1 | 792 | 264aa | ENSMUSP00000155825.1 | Protein coding | - | A0A2R8VI73 | CDS 5' and 3' incomplete
Pcbp2-218 | ENSMUST00000229958.1 | 789 | 241aa | ENSMUSP00000155784.1 | Protein coding | - | A0A2R8VI71 | CDS 3' incomplete
Pcbp2-213 | ENSMUST00000229746.1 | 786 | 205aa | ENSMUSP00000155016.1 | Protein coding | - | A0A2R8VJY1 | CDS 3' incomplete
Pcbp2-207 | ENSMUST00000229219.1 | 759 | 253aa | ENSMUSP00000155730.1 | Protein coding | - | A0A2R8VKN0 | CDS 5' and 3' incomplete
Pcbp2-221 | ENSMUST00000230211.1 | 704 | 190aa | ENSMUSP00000155440.1 | Protein coding | - | A0A2R8VKF0 | CDS 3' incomplete
Pcbp2-228 | ENSMUST00000230918.1 | 693 | 231aa | ENSMUSP00000155569.1 | Protein coding | - | A0A2R8VHZ0 | CDS 5' and 3' incomplete
Pcbp2-223 | ENSMUST00000230539.1 | 690 | 186aa | ENSMUSP00000155051.1 | Protein coding | - | A0A2R8VHG2 | CDS 3' incomplete
Pcbp2-211 | ENSMUST00000229533.1 | 674 | 225aa | ENSMUSP00000155219.1 | Protein coding | - | A0A2R8VHL8 | CDS 5' and 3' incomplete
Pcbp2-204 | ENSMUST00000229061.1 | 669 | 184aa | ENSMUSP00000154930.1 | Protein coding | - | A0A2R8W6H3 | CDS 5' incomplete
Pcbp2-208 | ENSMUST00000229222.1 | 647 | 168aa | ENSMUSP00000155240.1 | Protein coding | - | A0A2R8VHN3 | CDS 3' incomplete
Pcbp2-226 | ENSMUST00000230682.1 | 575 | 192aa | ENSMUSP00000155638.1 | Protein coding | - | A0A2R8VI15 | CDS 5' and 3' incomplete
Pcbp2-210 | ENSMUST00000229432.1 | 553 | 181aa | ENSMUSP00000155848.1 | Protein coding | - | A0A2R8VI92 | CDS 5' incomplete
Pcbp2-209 | ENSMUST00000229275.1 | 516 | 96aa | ENSMUSP00000155124.1 | Protein coding | - | A0A2R8VHI3 | GENCODE basic
Pcbp2-220 | ENSMUST00000230129.1 | 1230 | No protein | - | Retained intron | - | - | -
Pcbp2-229 | ENSMUST00000230997.1 | 909 | No protein | - | Retained intron | - | - | -
Pcbp2-225 | ENSMUST00000230631.1 | 585 | No protein | - | Retained intron | - | - | -
Pcbp2-222 | ENSMUST00000230282.1 | 427 | No protein | - | Retained intron | - | - | -
Pcbp2-215 | ENSMUST00000229822.1 | 2048 | No protein | - | lncRNA | - | - | -

[Figure: Ensembl genome browser view of the Pcbp2 region on chromosome 15 (49.52 kb, 102.47Mb to 102.51Mb). Forward-strand gene tracks: Prr13 (201 to 205), Pcbp2 (201 to 231), Gm10337-201 and Gm27406-201 (misc RNA). Reverse-strand gene track: Map3k12 (201 to 208). Other tracks: contig AC137156.3, Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]

Transcript: ENSMUST00000077037

[Figure: Ensembl protein summary for Pcbp2-201 (ENSMUSP00000076..., protein coding; 29.51 kb, forward strand), scale bar 0 to 362. Domain tracks: Superfamily (K Homology domain, type 1 superfamily), SMART (K Homology domain), Pfam (K Homology domain, type 1), PROSITE profiles (PS50084), PANTHER (PTHR10288:SF97, PTHR10288), Gene3D (K Homology domain, type 1 superfamily), CDD (cd02396, cd00105). Variant track: sequence variants (dbSNP and all other sources). Variant legend: stop retained variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
