https://www.alphaknockout.com

Mouse Paxip1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Paxip1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Paxip1 gene (NCBI Reference Sequence: NM_018878; Ensembl: ENSMUSG00000002221) is located on mouse chromosome 5. 21 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 21 (Transcript: ENSMUST00000002291). Exon 4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Paxip1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-25H24 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous mutant mice are developmentally retarded and embryonic lethal by E9.5.

Exon 4 starts at about 8.24% of the way into the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion: 2244 bp, and the size of intron 4 for 3'-loxP site insertion: 5742 bp. The size of the effective cKO region: ~564 bp. The cKO region does not contain any other known gene.
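The frameshift reasoning above can be illustrated with a minimal Python sketch. It assumes the coding sequence length can be approximated from the 1056 aa protein of ENSMUST00000002291 plus one stop codon. The function names are hypothetical, and the coding length of exon 4 is not stated in this report, so any value passed to `causes_frameshift` is an illustrative input only.

```python
def approx_cds_offset(protein_aa: int, fraction: float) -> int:
    """Approximate coding position (bp) at a given fraction of the CDS.

    Assumption: CDS length = (amino acids + 1 stop codon) * 3.
    """
    cds_bp = (protein_aa + 1) * 3
    return round(cds_bp * fraction)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True when deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp <= 0:
        raise ValueError("deleted_coding_bp must be positive")
    return deleted_coding_bp % 3 != 0


# 8.24% into a 3171 bp CDS is roughly coding position 261.
print(approx_cds_offset(1056, 0.0824))
```

A deletion whose coding length is a multiple of 3 would remove codons in frame instead, which is why the exon length modulo 3 is the quantity to check.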


Overview of the Targeting Strategy

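[Figure: schematic of the targeting strategy with four rows: wildtype allele (5' gRNA region and 3' gRNA region marked; exons 1, 3, 4 and 21 labeled), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Paxip1; homology arm; cKO region; loxP site.]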


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, with forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
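As a rough illustration of the self-alignment described in the note, the sketch below lists exact repeated windows within a sequence, which is what off-diagonal dots in a forward dot plot represent. It is not the tool used to produce the report's figure: the function name is hypothetical, only forward-strand exact matches are considered, and the 10 bp default simply mirrors the window size stated above.

```python
from collections import defaultdict


def self_dotplot_matches(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs, i < j, where seq[i:i+window] == seq[j:j+window]."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window by its sequence so identical windows share a bucket.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence with one 10 bp repeat separated by a 5 bp spacer.
toy = "AAACCCGGGT" + "TTGCA" + "AAACCCGGGT"
print(self_dotplot_matches(toy))
```

Long runs of consecutive pairs with a constant offset would correspond to the diagonal streaks that indicate tandem or dispersed repeats.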

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7064bp) | A(24.89% 1758) | C(19.78% 1397) | T(30.97% 2188) | G(24.36% 1721)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
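The base counts in the summary line can be turned into an overall GC fraction, and a sliding window gives the kind of profile the figure shows. The following is a minimal sketch with hypothetical function names. The 300 bp window matches the stated window size, while the step size is an assumption because the report does not give one.

```python
def gc_fraction_from_counts(a: int, c: int, t: int, g: int) -> float:
    """Overall GC fraction from per-base counts."""
    total = a + c + t + g
    if total == 0:
        raise ValueError("no bases counted")
    return (g + c) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC fraction of each full window of the sequence (step is an assumed parameter)."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append((chunk.count("G") + chunk.count("C")) / window)
    return values


# Counts taken from the summary line of this report (7064 bp in total).
print(round(gc_fraction_from_counts(a=1758, c=1397, t=2188, g=1721), 4))
```

With the reported counts the overall GC fraction comes out at about 44%, consistent with the note that no high-GC region was found.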


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       27781736   27784735   3000
YourSeq  97     279    459   3000   87.1%     chr12  +       55445497   55445688   192
YourSeq  90     328    535   3000   81.8%     chr2   +       156692659  156692845  187
YourSeq  85     303    459   3000   88.9%     chr8   +       36178905   36179078   174
YourSeq  80     277    459   3000   87.1%     chr3   -       97644200   97834641   190442
YourSeq  77     293    459   3000   88.9%     chr6   +       35264936   35265109   174
YourSeq  76     272    374   3000   92.4%     chr5   +       141222729  141222851  123
YourSeq  74     335    458   3000   89.4%     chr2   -       3272560    3272694    135
YourSeq  74     277    459   3000   88.6%     chr17  -       28465794   28465993   200
YourSeq  72     321    682   3000   95.1%     chr4   -       54912010   54912547   538
YourSeq  71     330    572   3000   92.8%     chr2   +       146548266  146548727  462
YourSeq  71     279    464   3000   82.3%     chr1   +       153545980  153546155  176
YourSeq  70     320    423   3000   85.9%     chr10  +       18324061   18324177   117
YourSeq  69     303    423   3000   87.1%     chr7   +       101045374  101045510  137
YourSeq  67     321    404   3000   91.5%     chr3   -       152295787  152295883  97
YourSeq  67     337    463   3000   93.6%     chr2   +       166778951  166779088  138
YourSeq  66     294    404   3000   84.3%     chr2   -       73709881   73709990   110
YourSeq  66     293    374   3000   92.7%     chr9   +       121743150  121843908  100759
YourSeq  65     115    458   3000   71.7%     chrX   +       94131098   94131251   154
YourSeq  64     321    484   3000   93.3%     chr9   -       57856993   57857506   514

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr5   -       27778172   27781171   3000
YourSeq  131    2519   2870  3000   82.0%     chr12  -       25347169   25347642   474
YourSeq  123    2510   2851  3000   89.8%     chr8   -       106427949  106428300  352
YourSeq  102    2511   2806  3000   91.9%     chr9   +       32318193   32318571   379
YourSeq  96     2509   2725  3000   76.4%     chr7   -       99376600   99376815   216
YourSeq  95     2588   2806  3000   90.0%     chr14  -       66888430   66888723   294
YourSeq  89     2528   2806  3000   93.4%     chr13  +       74170492   74170785   294
YourSeq  84     2510   2724  3000   85.6%     chr9   -       62568482   62568769   288
YourSeq  84     2589   2806  3000   89.8%     chr15  +       7841264    7841479    216
YourSeq  80     2584   2724  3000   85.5%     chr5   +       101626128  101626266  139
YourSeq  79     2514   2763  3000   86.8%     chr5   +       128658419  128658702  284
YourSeq  79     2575   2744  3000   90.7%     chr12  +       17009229   17009425   197
YourSeq  78     2537   2724  3000   87.4%     chr2   -       129266227  129266416  190
YourSeq  78     2509   2740  3000   78.6%     chr14  -       33242183   33242421   239
YourSeq  78     2519   2805  3000   84.4%     chr11  +       55280398   55280682   285
YourSeq  77     2537   2724  3000   87.3%     chr11  -       4011068    4011260    193
YourSeq  75     2520   2724  3000   87.8%     chr7   -       138053041  138053259  219
YourSeq  75     2511   2733  3000   77.0%     chr2   +       139768128  139768330  203
YourSeq  74     2513   2744  3000   85.8%     chr3   -       40898591   40898819   229
YourSeq  73     2509   2717  3000   85.3%     chr14  -       58681745   58681951   207

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
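The report does not state what counts as "significant" similarity. As a hedged sketch, the Python below parses BLAT result rows like those above and flags non-self hits above a score threshold. The class and function names are hypothetical, and the `min_score` default of 200 is an arbitrary illustrative cutoff, not the vendor's criterion.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one result row; leading labels such as 'YourSeq' are ignored."""
    f = row.split()[-10:]  # the ten data fields always come last
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the hit."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


def off_target_hits(hits: list[BlatHit], min_score: int = 200) -> list[BlatHit]:
    """Non-self hits scoring at least min_score (threshold is an assumption)."""
    def is_self(h: BlatHit) -> bool:
        return h.identity == 100.0 and query_coverage(h) == 1.0
    return [h for h in hits if not is_self(h) and h.score >= min_score]


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr5 - 27778172 27781171 3000",
    "YourSeq 131 2519 2870 3000 82.0% chr12 - 25347169 25347642 474",
]
hits = [parse_blat_row(r) for r in rows]
print(off_target_hits(hits))
```

Under this illustrative cutoff, the two rows shown produce no flagged hits, since the first is the query's own locus and the second scores below the threshold.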


Gene information: Paxip1

PAX interacting (with transcription-activation domain) protein 1 [Mus musculus (house mouse)]
Gene ID: 55982, updated on 12-Aug-2019

Gene summary

Official Symbol: Paxip1 (provided by MGI)
Official Full Name: PAX interacting (with transcription-activation domain) protein 1 (provided by MGI)
Primary source: MGI:MGI:1890430
See related: Ensembl:ENSMUSG00000002221
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: PTIP; D5Ertd149e
Summary: This gene encodes a nuclear-localized protein that contains six BRCT1 (C-terminal of breast cancer susceptibility protein) domains. The encoded protein is involved in the repair of DNA double-strand breaks and is necessary for progression through cell division. The protein also functions in the regulation of transcription by recruiting histone methyltransferases to gene promoters bound by the sequence-specific transcription factor paired box protein 2 (Pax2). [provided by RefSeq, Mar 2013]
Expression: Ubiquitous expression in thymus adult (RPKM 12.1), ovary adult (RPKM 11.4) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 5 B1; 5 13.23 cM

Exon count: 22

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  5    NC_000071.6 (27740080..27791672, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    5    NC_000071.5 (28067205..28117879, complement)

[Figure: genomic context of Paxip1 on chromosome 5 (NC_000071.6).]


Transcript information: This gene has 7 transcripts

Gene: Paxip1 ENSMUSG00000002221

Description: PAX interacting (with transcription-activation domain) protein 1 [Source:MGI Symbol;Acc:MGI:1890430]
Gene Synonyms: D5Ertd149e, PTIP
Location: Chromosome 5: 27,740,080-27,791,691 reverse strand. GRCm38:CM000998.2
About this gene: This gene has 7 transcripts (splice variants), 203 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 34 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Paxip1-201  ENSMUST00000002291.11  5850  1056aa      ENSMUSP00000002291.7  Protein coding           CCDS39039  Q6NZQ4      TSL:1, GENCODE basic, APPRIS P1
Paxip1-204  ENSMUST00000196734.1   749   92aa        ENSMUSP00000142578.1  Nonsense mediated decay  -          A0A0G2JE02  TSL:3
Paxip1-206  ENSMUST00000199714.1   3829  No protein  -                     Retained intron          -          -           TSL:NA
Paxip1-205  ENSMUST00000197625.1   736   No protein  -                     Retained intron          -          -           TSL:2
Paxip1-203  ENSMUST00000196641.1   717   No protein  -                     Retained intron          -          -           TSL:2
Paxip1-202  ENSMUST00000196605.1   515   No protein  -                     Retained intron          -          -           TSL:2
Paxip1-207  ENSMUST00000199993.4   381   No protein  -                     lncRNA                   -          -           TSL:3


[Figure: Ensembl genome browser view of the Paxip1 locus (71.61 kb window on chromosome 5, 27.74 Mb to 27.80 Mb; contig AC146612.2). Reverse-strand transcripts shown in the comprehensive gene set: Paxip1-201 (protein coding), Paxip1-202, Paxip1-203, Paxip1-205 and Paxip1-206 (retained intron), Paxip1-204 (nonsense mediated decay) and Paxip1-207 (lncRNA). A Regulatory Build track is included. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000002291

[Figure: protein domain and variant view of Paxip1-201 (protein coding; reverse strand, 51.47 kb; protein scale 0 to 1056 aa). Annotation tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily (BRCT domain superfamily); SMART (BRCT domain); Pfam (BRCT domain); PROSITE profiles (BRCT domain); PANTHER (PTHR23196, PTHR23196:SF1); Gene3D (BRCT domain superfamily); CDD (cd17714, cd17710, cd17711, cd17730, cd17712, cd18440); sequence variants (dbSNP and all other sources). Variant legend: inframe deletion; missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
