https://www.alphaknockout.com

Mouse Hnrnpc Knockout Project (CRISPR/Cas9)

Objective: To create an Hnrnpc knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Hnrnpc gene (NCBI Reference Sequence: NM_016884; Ensembl: ENSMUSG00000060373) is located on mouse chromosome 14. Nine exons are identified, with the ATG start codon in exon 3 and the TAA stop codon in exon 9 (Transcript: ENSMUST00000111610). Exons 3~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a gene-trapped allele fail to undergo gastrulation, appear to arrest at the egg cylinder stage, and are resorbed at various times thereafter.

Exon 3 starts from about 0.11% of the coding region. Exons 3~9 cover 100.0% of the coding region. The size of the effective KO region is ~9271 bp. The KO region does not contain any other known gene.
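Because the pups are genotyped by PCR, a deletion of the stated size changes the expected amplicon length. The following is a minimal sketch of that arithmetic only, assuming primers that sit just outside the deleted region. The flank distances (300 bp and 400 bp) and the function name `expected_amplicons` are hypothetical and do not come from the report, and real CRISPR deletions vary at the junctions, so the sizes are approximate.

```python
# Effective KO region size stated in the report (approximate).
KO_REGION_BP = 9271


def expected_amplicons(upstream_flank_bp, downstream_flank_bp,
                       deletion_bp=KO_REGION_BP):
    """Return (wildtype_bp, knockout_bp) for primers flanking the deletion.

    upstream_flank_bp / downstream_flank_bp are the hypothetical distances
    from each primer's outer end to the nearest deletion boundary.
    """
    if min(upstream_flank_bp, downstream_flank_bp, deletion_bp) < 0:
        raise ValueError("distances must be non-negative")
    knockout = upstream_flank_bp + downstream_flank_bp
    wildtype = knockout + deletion_bp
    return wildtype, knockout


if __name__ == "__main__":
    wt, ko = expected_amplicons(300, 400)  # hypothetical primer placement
    print(f"wildtype ~{wt} bp, knockout ~{ko} bp")
```

With these illustrative flanks the wildtype product would be about 9971 bp and the knockout product about 700 bp.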

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', with exons labeled 1, 3, 4, 5, 6, 7, 8 and 9 and a gRNA region on each side of the knockout region. Legend: exon of mouse Hnrnpc; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
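The self-alignment behind a dot plot can be sketched as a search for identical windows at two different positions of the same sequence. The code below is an illustrative sketch, not the tool used for this report: the function name `dot_plot_matches` is hypothetical, it only covers forward matches (the plots above also show reverse complement matches), and it requires exact identity within the 15 bp window.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the windows starting
    at i and j are identical. Each pair is one off-diagonal dot."""
    seq = seq.upper()
    positions = {}
    # Index every window by its sequence.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    # Any window seen more than once yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


if __name__ == "__main__":
    # Toy input: an 8 bp unit repeated three times (a tandem repeat).
    toy = "ACGTTGCA" * 3
    dots = dot_plot_matches(toy, window=8)
    print(len(dots), dots[:3])
```

A run of dots parallel to the main diagonal, as the toy tandem repeat produces, is the pattern the notes above report as absent in both 2000 bp flanks.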

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(30.5% 610) | C(15.45% 309) | T(33.9% 678) | G(20.15% 403)

Note: The 2000 bp section upstream of the start codon is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(2000bp) | A(27.05% 541) | C(19.25% 385) | T(34.55% 691) | G(19.15% 383)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine the GC content. No significant high-GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
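The overall GC percentage of each 2000 bp section follows directly from the base counts in the two summaries, and the windowed distribution can be sketched as a sliding count of G and C. This is a minimal sketch under stated assumptions: the function names are hypothetical, the step size of 1 bp is an assumption (the report only gives the 300 bp window size), and the toy sequence stands in for the real flanks, which are not reproduced here.

```python
def gc_percent_from_counts(a, c, g, t):
    """Overall GC percentage from per-base counts."""
    total = a + c + g + t
    if total == 0:
        raise ValueError("empty sequence")
    return round(100.0 * (c + g) / total, 2)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window along seq (step is an assumption)."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(round(100.0 * gc / window, 2))
    return values


if __name__ == "__main__":
    # Base counts copied from the two summary lines of the report.
    print(gc_percent_from_counts(a=610, c=309, g=403, t=678))  # 35.6 (up)
    print(gc_percent_from_counts(a=541, c=385, g=383, t=691))  # 38.4 (down)
    # Toy sequence only, to show the windowed calculation.
    print(sliding_gc("GGCCAATT", window=4, step=2))  # [100.0, 50.0, 0.0]
```

Both flanks come out below 40% GC overall, which agrees with the notes that no high GC-content region is found.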

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  -       52084356   52086355   2000
YourSeq  179    859    1052  2000   94.8%     chr11  -       4851436    4851626    191
YourSeq  178    664    1044  2000   93.7%     chr9   -       23247903   23248516   614
YourSeq  176    856    1060  2000   94.8%     chr18  +       24068297   24068499   203
YourSeq  174    860    1044  2000   97.9%     chr5   -       129898273  129898460  188
YourSeq  174    859    1045  2000   97.3%     chr2   +       31671706   31671895   190
YourSeq  174    859    1459  2000   84.9%     chr11  +       72656896   72657114   219
YourSeq  173    860    1045  2000   97.9%     chr16  -       21712164   21712351   188
YourSeq  173    859    1044  2000   97.3%     chr12  -       111513283  111513471  189
YourSeq  171    859    1044  2000   96.8%     chr5   +       66007499   66007684   186
YourSeq  170    861    1052  2000   95.7%     chrX   -       8236098    8236289    192
YourSeq  170    859    1044  2000   96.8%     chr9   -       32213055   32213240   186
YourSeq  170    862    1044  2000   97.3%     chr11  -       77401638   77401824   187
YourSeq  170    859    1044  2000   96.8%     chr11  -       69819342   69819532   191
YourSeq  170    859    1045  2000   96.3%     chr2   +       58624055   58624244   190
YourSeq  169    859    1045  2000   96.3%     chr4   +       123818437  123818626  190
YourSeq  169    859    1044  2000   96.2%     chr11  +       98841786   98841974   189
YourSeq  169    859    1046  2000   94.6%     chr1   +       156855275  156855461  187
YourSeq  168    859    1067  2000   93.9%     chr8   -       121256871  121257098  228
YourSeq  168    858    1044  2000   96.2%     chr10  -       78416039   78416241   203

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr14  -       52073083   52075082   2000
YourSeq  598    1      633   2000   97.3%     chr4   -       97520801   97521430   630
YourSeq  533    1      631   2000   92.7%     chr15  -       36119181   36120000   820
YourSeq  399    1      613   2000   88.8%     chr6   -       75376472   75383127   6656
YourSeq  266    1      288   2000   96.2%     chr19  +       29615200   29615487   288
YourSeq  220    1      281   2000   92.0%     chr4   -       46273533   46273821   289
YourSeq  127    577    792   2000   91.1%     chr15  +       44871108   44871338   231
YourSeq  126    234    630   2000   84.1%     chr3   -       27197454   27197845   392
YourSeq  126    652    849   2000   91.0%     chr12  -       4394439    4395046    608
YourSeq  124    651    793   2000   93.8%     chr9   -       88833121   88833265   145
YourSeq  123    650    799   2000   91.4%     chr16  +       18058211   18058362   152
YourSeq  122    650    796   2000   91.9%     chr13  -       40809413   40809561   149
YourSeq  122    650    789   2000   93.6%     chr11  +       106187236  106187375  140
YourSeq  120    650    793   2000   91.7%     chr8   -       87523691   87523834   144
YourSeq  120    650    793   2000   92.4%     chr14  -       70704089   70704561   473
YourSeq  120    650    812   2000   84.2%     chr11  +       54668006   54668156   151
YourSeq  119    650    788   2000   92.9%     chr4   -       132206750  132206888  139
YourSeq  119    650    792   2000   91.7%     chr12  -       12825159   12825301   143
YourSeq  118    650    797   2000   89.9%     chr17  +       26626726   26626873   148
YourSeq  118    657    793   2000   93.5%     chr14  +       20322084   20322392   309

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. No significant similarity is found.
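The BLAT rows can be read programmatically to separate the full-length self hit from the partial hits elsewhere in the genome. The sketch below is illustrative only: `BlatHit`, `parse_hit` and `is_full_length` are hypothetical names, the field order is taken from the column header of the tables above, and only the first three rows of each table are included. It also computes the interval between the upstream and downstream self hits on chr14, which assumes that the two 2000 bp windows directly abut the coding region; the report does not state this explicitly.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row (QUERY SCORE START END ...)."""
    f = row.split()
    if f[:2] == ["browser", "details"]:  # link text left by page extraction
        f = f[2:]
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_full_length(hit: BlatHit) -> bool:
    """True when the hit spans the whole query (the self hit here)."""
    return hit.q_start == 1 and hit.q_end == hit.q_size


# First rows of the upstream and downstream tables in the report.
UP_ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr14 - 52084356 52086355 2000",
    "YourSeq 179 859 1052 2000 94.8% chr11 - 4851436 4851626 191",
    "YourSeq 178 664 1044 2000 93.7% chr9 - 23247903 23248516 614",
]
DOWN_ROWS = [
    "YourSeq 2000 1 2000 2000 100.0% chr14 - 52073083 52075082 2000",
    "YourSeq 598 1 633 2000 97.3% chr4 - 97520801 97521430 630",
    "YourSeq 533 1 631 2000 92.7% chr15 - 36119181 36120000 820",
]

up_hits = [parse_hit(r) for r in UP_ROWS]
down_hits = [parse_hit(r) for r in DOWN_ROWS]
up_self = next(h for h in up_hits if is_full_length(h))
down_self = next(h for h in down_hits if is_full_length(h))

# The gene is on the reverse strand, so the upstream window has the
# higher coordinates; the interval lies strictly between the two windows.
interval_bp = up_self.t_start - down_self.t_end - 1
best_partial_up = max(h.score for h in up_hits if not is_full_length(h))
best_partial_down = max(h.score for h in down_hits if not is_full_length(h))

if __name__ == "__main__":
    print(interval_bp)        # 9273
    print(best_partial_up)    # 179
    print(best_partial_down)  # 598
```

Under that assumption the interval comes to 9273 bp, close to the ~9271 bp effective KO region given in the strategy summary; the source does not explain the small difference.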

Gene and information:

Hnrnpc heterogeneous nuclear ribonucleoprotein C [Mus musculus (house mouse)]
Gene ID: 15381, updated on 14-Aug-2019

Gene summary

Official Symbol: Hnrnpc (provided by MGI)
Official Full Name: heterogeneous nuclear ribonucleoprotein C (provided by MGI)
Primary source: MGI:MGI:107795
See related: Ensembl:ENSMUSG00000060373
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Hnrpc; Hnrpc1; Hnrpc2; hnRNPC1; hnRNPC2; hnrnp-C; AL022939; D14Wsu171e
Expression: Broad expression in CNS E11.5 (RPKM 123.0), CNS E14 (RPKM 61.1) and 23 other tissues
Orthologs: human, all

Genomic context

Location: 14 C2; 14 26.79 cM
Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (52073377..52104039, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (52693055..52723703, complement)

Chromosome 14 - NC_000080.6

Transcript information: This gene has 13 transcripts

Gene: Hnrnpc ENSMUSG00000060373

Description: heterogeneous nuclear ribonucleoprotein C [Source:MGI Symbol;Acc:MGI:107795]
Gene Synonyms: D14Wsu171e, Hnrpc, Hnrpc1, Hnrpc2, hnRNP C1, hnRNP C2, hnRNPC1, hnRNPC2
Location: 52,073,377-52,104,028 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 13 transcripts (splice variants), 191 orthologues, 2 paralogues, is a member of 2 Ensembl protein families and is associated with 8 phenotypes.

Transcripts

Name Transcript ID bp Protein Translation ID Biotype CCDS UniProt Flags

Hnrnpc-201 ENSMUST00000111610.11 2847 313aa ENSMUSP00000107237.4 Protein coding CCDS36917 Q9Z204 TSL:1 GENCODE basic

Hnrnpc-202 ENSMUST00000164655.1 2843 313aa ENSMUSP00000133052.1 Protein coding CCDS36917 Q9Z204 TSL:5 GENCODE basic

Hnrnpc-205 ENSMUST00000227242.1 2617 300aa ENSMUSP00000154757.1 Protein coding - Q9Z204 GENCODE basic

Hnrnpc-210 ENSMUST00000228232.1 2535 293aa ENSMUSP00000154619.1 Protein coding - Q9Z204 GENCODE basic APPRIS ALT2

Hnrnpc-209 ENSMUST00000228198.1 2498 306aa ENSMUSP00000154212.1 Protein coding - Q9Z204 GENCODE basic APPRIS P5

Hnrnpc-211 ENSMUST00000228748.1 1691 293aa ENSMUSP00000154166.1 Protein coding - Q9Z204 GENCODE basic APPRIS ALT2

Hnrnpc-206 ENSMUST00000227458.1 1683 292aa ENSMUSP00000154238.1 Protein coding - Q9Z204 GENCODE basic APPRIS ALT1

Hnrnpc-207 ENSMUST00000227536.1 1301 300aa ENSMUSP00000154737.1 Protein coding - Q9Z204 GENCODE basic

Hnrnpc-213 ENSMUST00000228815.1 755 28aa ENSMUSP00000154347.1 Protein coding - A0A2I3BQW7 CDS 3' incomplete

Hnrnpc-204 ENSMUST00000227195.1 745 117aa ENSMUSP00000154537.1 Protein coding - A0A2I3BRM6 CDS 3' incomplete

Hnrnpc-203 ENSMUST00000226993.1 582 80aa ENSMUSP00000154182.1 Protein coding - A0A2I3BQH3 CDS 3' incomplete

Hnrnpc-208 ENSMUST00000228045.1 660 No protein - Retained intron - - -

Hnrnpc-212 ENSMUST00000228786.1 611 No protein - Retained intron - - -

[Figure: Ensembl genome browser view of a 50.65 kb region (52.07Mb to 52.11Mb). Forward strand: Gm22354-201 (snoRNA), Gm8586-201 (processed pseudogene) and Rpgrip1 transcripts (Rpgrip1-208 nonsense mediated decay; Rpgrip1-205 and Rpgrip1-218 retained intron; Rpgrip1-202, Rpgrip1-201 and Rpgrip1-213 protein coding). Contig: AC157572.4. Reverse strand: Gm49258-201 (processed pseudogene) and Hnrnpc transcripts (Hnrnpc-201, -202, -205, -209, -211, -206, -207, -210, -203 and -213 protein coding; Hnrnpc-208 and -212 retained intron). Regulatory Build legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, pseudogene, RNA gene).]

Transcript: ENSMUST00000111610

[Figure: protein domain view of Hnrnpc-201 (protein coding, reverse strand, 30.65 kb) for ENSMUSP00000107..., with a scale bar from 0 to 313. Tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: RNA-binding domain superfamily; SMART, Pfam and PROSITE profiles: RNA recognition motif domain; PIRSF: Heterogeneous nuclear ribonucleoprotein C; PANTHER: PTHR13968, PTHR13968:SF3; Gene3D: Nucleotide-binding alpha-beta plait domain superfamily; CDD: cd12603; sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
