https://www.alphaknockout.com

Mouse Prpf19 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Prpf19 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated engineering.

Strategy summary: The Prpf19 gene (NCBI Reference Sequence: NM_001253843; Ensembl: ENSMUSG00000024735) is located on mouse chromosome 19. Sixteen exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 16 (Transcript: ENSMUST00000179297). Exons 2~3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Prpf19 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-404I24 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null allele die prior to implantation and have defective cell proliferation.

Exon 2 starts from about 1.27% of the coding region. The knockout of exons 2~3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion: 2034 bp, and the size of intron 3 for 3'-loxP site insertion: 1032 bp. The size of the effective cKO region: ~891 bp. The cKO region does not contain any other known gene.
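As a rough arithmetic check, the "about 1.27%" figure can be converted into a nucleotide offset. The sketch below assumes the percentage refers to the coding sequence of Prpf19-203 (523 aa, per the transcript table in this report) and that the coding sequence includes the stop codon. The helper `causes_frameshift` is a hypothetical illustration of the frameshift rule, not part of the project design; the exon sizes themselves are not given in this report.

```python
# Hedged arithmetic sketch; the constants come from this report, the
# interpretation (percentage of the coding sequence) is an assumption.
PROTEIN_LENGTH_AA = 523         # Prpf19-203 / ENSMUST00000179297
EXON2_START_FRACTION = 0.0127   # "about 1.27%"

# Assumption: coding sequence = all codons plus one stop codon.
cds_length_nt = (PROTEIN_LENGTH_AA + 1) * 3
exon2_offset_nt = EXON2_START_FRACTION * cds_length_nt


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


print(f"CDS length: {cds_length_nt} nt")
print(f"Exon 2 begins roughly {exon2_offset_nt:.1f} nt into the CDS")
```

Under these assumptions, exon 1 contributes only about 20 coding nucleotides, so nearly all of the coding sequence lies at or downstream of the cKO region.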


Overview of the Targeting Strategy

[Figure: schematic of four alleles/constructs.
 - Wildtype allele: exons 1, 2, 3, 4, 5, 6 ... 16, with the 5' and 3' gRNA regions marked
 - Targeting vector
 - Targeted allele
 - Constitutive KO allele (after Cre recombination)
 Legend: homology arm; exon of mouse Prpf19; cKO region; loxP site.]
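The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model: Cre recombination between two loxP sites in the same orientation excises the intervening sequence and leaves a single loxP site behind. The sketch below uses the canonical 34 bp loxP sequence; the toy allele and the function name `cre_excise` are hypothetical and only serve the illustration.

```python
# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str) -> str:
    """Toy model of Cre excision between the first two same-orientation loxP sites.

    Returns the allele with the floxed segment removed and one loxP retained.
    If fewer than two loxP sites are present, the allele is returned unchanged.
    """
    first = allele.find(LOXP)
    if first == -1:
        return allele
    second = allele.find(LOXP, first + len(LOXP))
    if second == -1:
        return allele
    return allele[:first] + LOXP + allele[second + len(LOXP):]


# Hypothetical toy allele: 5' flank, loxP, "cKO region", loxP, 3' flank.
targeted = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
ko = cre_excise(targeted)
print(len(targeted) - len(ko))  # floxed segment plus one loxP site removed
```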


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
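The self-alignment idea can be sketched as an exact-match word comparison: every pair of positions whose 10 bp words are identical becomes a dot, and dots away from the main diagonal indicate repeats. This is a minimal illustration under that assumption, not the tool used to produce the plot; the function names and the toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_repeat_hits(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j whose window-length words are identical.

    These are the off-diagonal dots of a forward self dot plot.
    """
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy sequence with one 10 bp word repeated after a 3 bp spacer.
toy = "ACGTTGCAAT" + "GGG" + "ACGTTGCAAT"
print(forward_repeat_hits(toy, window=10))
```

Reverse-complement dots can be obtained in the same way by comparing the words of `seq` against the words of `reverse_complement(seq)`.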

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7391bp) | A(23.6% 1744) | C(23.27% 1720) | T(27.56% 2037) | G(25.57% 1890)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
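The summary line can be checked directly from the base counts, and the windowed analysis can be sketched as a sliding-window GC percentage. The counts below are taken from the summary; `windowed_gc` and its `step` parameter are hypothetical illustrations, since the report does not state how the windows were advanced.

```python
# Base counts from the summary line (full length 7391 bp).
BASE_COUNTS = {"A": 1744, "C": 1720, "T": 2037, "G": 1890}

total = sum(BASE_COUNTS.values())
percentages = {base: round(100 * n / total, 2) for base, n in BASE_COUNTS.items()}
overall_gc = 100 * (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage of each full window along the sequence."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(total, percentages, round(overall_gc, 2))
```

The overall GC content computed this way is moderate, so the note about high GC content refers to local windows rather than to the sequence as a whole.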


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr19  +       10894254   10897253   3000
YourSeq  158    990     1215  3000   87.9%     chr18  +       24257196   24257396   201
YourSeq  98     100     282   3000   77.4%     chr19  +       5913751    5913897    147
YourSeq  78     97      402   3000   72.9%     chr11  -       59258356   59258563   208
YourSeq  76     97      204   3000   85.2%     chr19  +       15219145   15219252   108
YourSeq  74     100     204   3000   85.8%     chr10  +       128552948  128553174  227
YourSeq  73     100     421   3000   91.0%     chr15  -       76148284   76148652   369
YourSeq  71     100     200   3000   85.2%     chr13  +       64119060   64119160   101
YourSeq  69     116     204   3000   88.8%     chr11  +       100309434  100309522  89
YourSeq  68     97      200   3000   82.7%     chr12  +       96252420   96252523   104
YourSeq  67     116     204   3000   87.7%     chr12  -       58253978   58254066   89
YourSeq  67     115     204   3000   90.4%     chr19  +       4235378    4235673    296
YourSeq  66     100     197   3000   83.7%     chr2   +       55983666   55983763   98
YourSeq  65     18      198   3000   84.3%     chr2   -       94424372   94424552   181
YourSeq  65     100     206   3000   80.4%     chr11  -       83296825   83296931   107
YourSeq  64     97      200   3000   87.3%     chr17  -       8069456    8069563    108
YourSeq  64     97      204   3000   79.7%     chr11  +       53235168   53235275   108
YourSeq  62     127     204   3000   89.8%     chr7   -       4493853    4493930    78
YourSeq  62     146     230   3000   87.0%     chr1   -       135522492  135522577  86
YourSeq  61     116     204   3000   87.7%     chr15  -       69472163   69472251   89

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
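A BLAT result table of this kind can be screened programmatically. The sketch below parses whitespace-separated rows (query, score, query start/end, query size, identity, chromosome, strand, target start/end, span), sets aside the full-length 100% self-hit, and lists the remaining hits at or above a score cutoff. The cutoff of 100 and all names in the code are hypothetical; the report does not state which threshold defines "significant similarity".

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one row; the first field (query name) is skipped."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def is_self_hit(hit: BlatHit) -> bool:
    """Full-length, 100%-identity match: the query's own genomic locus."""
    return hit.identity == 100.0 and hit.q_end - hit.q_start + 1 == hit.q_size


def secondary_hits(hits: list[BlatHit], min_score: int) -> list[BlatHit]:
    """Non-self hits with score >= min_score (cutoff is an assumption)."""
    return [h for h in hits if not is_self_hit(h) and h.score >= min_score]


# First three rows of the upstream search, copied from the table above.
ROWS = """
YourSeq 3000 1 3000 3000 100.0% chr19 + 10894254 10897253 3000
YourSeq 158 990 1215 3000 87.9% chr18 + 24257196 24257396 201
YourSeq 98 100 282 3000 77.4% chr19 + 5913751 5913897 147
"""
hits = [parse_blat_row(r) for r in ROWS.strip().splitlines()]
for h in secondary_hits(hits, min_score=100):
    print(h.chrom, h.score, f"{h.identity}%")
```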

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chr19  +       10898145   10901144   3000
YourSeq  501    838     2893  3000   89.8%     chr18  +       24257623   24258202   580
YourSeq  65     20      213   3000   83.7%     chr11  -       107122616  107123015  400
YourSeq  56     1       74    3000   83.1%     chr6   -       112779132  112779202  71
YourSeq  55     7       87    3000   84.0%     chr10  -       52330393   52330473   81
YourSeq  54     18      128   3000   84.0%     chr9   -       78149203   78149401   199
YourSeq  53     1       96    3000   78.2%     chr9   -       115606196  115606311  116
YourSeq  53     18      86    3000   89.4%     chr6   +       48471542   48471610   69
YourSeq  51     7       86    3000   82.3%     chr8   +       121677635  121677716  82
YourSeq  50     18      86    3000   94.7%     chr15  -       84387264   84387332   69
YourSeq  50     7       86    3000   81.3%     chr1   -       57518160   57518239   80
YourSeq  49     18      86    3000   93.0%     chr7   -       24596348   24596416   69
YourSeq  49     22      86    3000   87.7%     chr4   +       108036506  108036570  65
YourSeq  49     18      86    3000   85.6%     chr3   +       95333419   95333487   69
YourSeq  49     18      86    3000   93.0%     chr11  +       98443755   98443827   73
YourSeq  48     18      86    3000   85.6%     chr11  -       19911260   20030584   119325
YourSeq  48     15      86    3000   83.4%     chr4   +       147967231  147967302  72
YourSeq  48     26      86    3000   94.5%     chr11  +       91422177   91422237   61
YourSeq  47     26      88    3000   87.4%     chr7   -       37740340   37740402   63
YourSeq  47     22      88    3000   94.4%     chr4   -       33505895   33505961   67

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
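The chr19 coordinates of the two self-hits allow a consistency check on the stated cKO region size. The sketch below assumes that the upstream and downstream 3000 bp queries directly flank the effective cKO region, which the report does not state explicitly; the function name is hypothetical.

```python
# Coordinates of the 100% self-hits on chr19, taken from the two BLAT tables.
UPSTREAM_HIT_END = 10897253      # last base of the upstream 3000 bp section
DOWNSTREAM_HIT_START = 10898145  # first base of the downstream 3000 bp section


def region_between(up_end: int, down_start: int) -> tuple[int, int, int]:
    """Return (start, end, size) of the interval strictly between two flanks.

    Coordinates are treated as 1-based and inclusive.
    """
    start = up_end + 1
    end = down_start - 1
    return start, end, end - start + 1


start, end, size = region_between(UPSTREAM_HIT_END, DOWNSTREAM_HIT_START)
print(f"chr19:{start}-{end} ({size} bp)")
```

Under that assumption, the interval between the flanks agrees with the ~891 bp effective cKO region quoted in the strategy summary.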


Gene and protein information:

Prpf19 pre-mRNA processing factor 19 [ Mus musculus (house mouse) ]
Gene ID: 28000, updated on 12-Aug-2019

Gene summary

Official Symbol: Prpf19 (provided by MGI)
Official Full Name: pre-mRNA processing factor 19 (provided by MGI)
Primary source: MGI:MGI:106247
See related: Ensembl:ENSMUSG00000024735
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PSO4; Snev; Prp19; NMP200; AA617263; AL024362; D19Wsu55e
Expression: Ubiquitous expression in ovary adult (RPKM 53.7), adrenal adult (RPKM 44.5) and 28 other tissues
Orthologs: human; all

Genomic context

Location: 19 A; 19 7.33 cM

Exon count: 17

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  19   NC_000085.6 (10895231..10909559)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    19   NC_000085.5 (10969782..10980022)

[Figure: genomic context of Prpf19, Chromosome 19 - NC_000085.6.]


Transcript information: This gene has 8 transcripts

Gene: Prpf19 ENSMUSG00000024735

Description: pre-mRNA processing factor 19 [Source:MGI Symbol;Acc:MGI:106247]
Gene Synonyms: D19Wsu55e, PSO4, Prp19, Snev
Location: Chromosome 19: 10,895,231-10,909,559 forward strand. GRCm38:CM001012.2
About this gene: This gene has 8 transcripts (splice variants), 198 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Prpf19-203  ENSMUST00000179297.2   6035  523aa       ENSMUSP00000136858.2  Protein coding   CCDS57135  Q99KP6   TSL:1 GENCODE basic
Prpf19-201  ENSMUST00000025642.13  2134  504aa       ENSMUSP00000025642.8  Protein coding   CCDS79687  Q99KP6   TSL:1 GENCODE basic APPRIS P1
Prpf19-207  ENSMUST00000191092.6   784   No protein  -                     Retained intron  -          -        TSL:2
Prpf19-205  ENSMUST00000187751.6   781   No protein  -                     Retained intron  -          -        TSL:2
Prpf19-206  ENSMUST00000190275.1   708   No protein  -                     Retained intron  -          -        TSL:1
Prpf19-204  ENSMUST00000186416.1   601   No protein  -                     Retained intron  -          -        TSL:2
Prpf19-208  ENSMUST00000191552.1   260   No protein  -                     Retained intron  -          -        TSL:5
Prpf19-202  ENSMUST00000178868.7   1758  No protein  -                     lncRNA           -          -        TSL:1


[Figure: genome browser view of the Prpf19 locus, 34.33 kb, chromosome 19 from 10.89 Mb to 10.91 Mb.
 Forward strand genes (Comprehensive set...): Prpf19-201 and Prpf19-203 (protein coding); Prpf19-204, Prpf19-205, Prpf19-206, Prpf19-207 and Prpf19-208 (retained intron); Prpf19-202 (lncRNA).
 Contigs: AC132402.3.
 Reverse strand genes (Comprehensive set...): Zp1-201 and Zp1-202 (protein coding).
 Regulatory Build track. Regulation legend: CTCF; Open; Promoter; Promoter Flank.
 Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (processed transcript).]


Transcript: ENSMUST00000179297

[Figure: protein domain view of Prpf19-203 (protein coding), 14.15 kb, forward strand; translation ENSMUSP00000136...; scale bar 0 to 523 aa.
 Annotation tracks, in the order shown in the figure:
 - Low complexity (Seg)
 - Superfamily: SSF57850; WD40-repeat-containing domain superfamily
 - SMART: U box domain; WD40 repeat
 - Prints: G-protein beta WD-40 repeat
 - Pfam: U box domain; Pre-mRNA-splicing factor 19; WD40 repeat
 - PROSITE profiles: U box domain; WD40 repeat; WD40-repeat-containing domain; Ricin B, lectin domain
 - PROSITE patterns: WD40 repeat, conserved site
 - PANTHER: PTHR43995:SF1; Pre-mRNA-processing factor 19
 - Gene3D: Zinc finger, RING/FYVE/PHD-type; WD40/YVTN repeat-like-containing domain superfamily
 - CDD: cd16656; cd00200
 - All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
 Variant legend: frameshift variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
