https://www.alphaknockout.com

Mouse Nubp2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Nubp2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nubp2 gene (NCBI Reference Sequence: NM_011956; Ensembl: ENSMUSG00000039183) is located on mouse chromosome 17. Seven exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000044252). Exons 2~6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Nubp2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-397O23 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exons 2~6 are not frameshift exons and cover 79.27% of the coding region. The size of intron 1 for 5'-loxP site insertion is 441 bp, and the size of intron 6 for 3'-loxP site insertion is 491 bp. The size of the effective cKO region is ~2497 bp. The cKO region does not contain any other known gene.
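The two checks in the note (frameshift status and coding-region coverage) can be sketched as below. This is a minimal illustration, not the vendor's pipeline. The lengths are assumptions and are not stated in this report: 825 bp is 275 codons x 3 with the stop codon excluded, and 654 bp is an illustrative value for the coding sequence in exons 2~6 that happens to be consistent with the reported 79.27%.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def coding_coverage(deleted_coding_bp: int, total_coding_bp: int) -> float:
    """Percentage of the coding sequence removed by the deletion."""
    return 100.0 * deleted_coding_bp / total_coding_bp


# Assumed, illustrative lengths (NOT taken from the report):
total_cds_bp = 275 * 3   # 275 aa protein, stop codon excluded
deleted_bp = 654         # hypothetical coding length of exons 2~6

frameshift = causes_frameshift(deleted_bp)
coverage = round(coding_coverage(deleted_bp, total_cds_bp), 2)
print(frameshift, coverage)
```

With an in-frame deletion such as this one, loss of function rests on removing most of the coding sequence rather than on a downstream frameshift.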


Overview of the Targeting Strategy

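[Figure: schematic of the targeting strategy, drawn 5' to 3'. Four rows are shown: the wildtype allele (Nubp2 exons 1 to 7 with exons of neighbouring genes, and two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination).]

Legend: homology arm; cKO region; loxP site; exon of mouse Nubp2; exon of mouse Spsb3; exon of mouse Igfals.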
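The step from the targeted allele to the constitutive KO allele can be pictured with a toy model. The sketch below assumes two loxP sites in the same orientation and represents an allele as a simple list of labels; the function name `cre_recombine` and the list layout are hypothetical and only meant to illustrate the excision.

```python
from typing import List


def cre_recombine(allele: List[str]) -> List[str]:
    """Excise everything between the first and last loxP site, leaving one loxP."""
    sites = [i for i, feature in enumerate(allele) if feature == "loxP"]
    if len(sites) < 2:
        return list(allele)  # fewer than two sites: nothing to excise
    first, last = sites[0], sites[-1]
    # Keep everything up to and including the first loxP, then everything after the last.
    return allele[: first + 1] + allele[last + 1:]


# Targeted allele: loxP sites in intron 1 and intron 6 flank exons 2~6.
targeted = ["exon1", "loxP", "exon2", "exon3", "exon4",
            "exon5", "exon6", "loxP", "exon7"]
ko = cre_recombine(targeted)
print(ko)
```

The model ignores loxP orientation and sequence; it only shows which exons remain after recombination.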


Overview of the Dot Plot
Window size: 10 bp

[Figure: dot plot for Sequence 12, with forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
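A self-comparison of this kind can be sketched as follows. This is a simplified illustration, assuming exact matches of 10 bp words on the forward strand only; the tool actually used for the report is not named, and the function name `self_dot_plot` and the toy sequence are hypothetical.

```python
def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where the window-length words at i and j are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    dots = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Toy sequence with the same 10 bp word at positions 0 and 20.
toy = "ACGTACGTAC" + "TTTTGGGGCC" + "ACGTACGTAC"
dots = self_dot_plot(toy, window=10)
print(dots)
```

Runs of dots along a line parallel to the main diagonal would indicate a repeat; an empty result corresponds to a clean plot.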

Overview of the GC Content Distribution
Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(8962bp) | A(22.73% 2037) | C(27.73% 2485) | T(22.18% 1988) | G(27.36% 2452)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so it may be difficult to construct this targeting vector.
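The overall GC content implied by the summary line, and a windowed profile like the one plotted, can be computed as in the sketch below. The base counts come from the summary above; the function names and the toy sequence are hypothetical, and the report does not state what threshold counts as "significantly high".

```python
def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Overall GC percentage from per-base counts."""
    return 100.0 * (g + c) / (a + c + g + t)


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """GC percentage of each full-length window along the sequence."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Counts taken from the summary line (full length 8962 bp).
overall = gc_percent_from_counts(a=2037, c=2485, g=2452, t=1988)
print(round(overall, 2))

# Toy example with a 4 bp window, only to show the shape of the output.
profile = windowed_gc("GGCCAATT", window=4, step=2)
print(profile)
```

The overall figure is about 55.1% GC; the windowed profile is what reveals local peaks that a single average hides.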


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  -       24886058   24889057   3000
YourSeq  176    554    765   3000   91.9%     chr10  +       42841075   42841286   212
YourSeq  72     625    722   3000   89.2%     chr2   +       126898998  126899121  124
YourSeq  71     607    726   3000   89.1%     chr4   +       137775661  137775816  156
YourSeq  69     632    725   3000   91.6%     chr11  +       101662955  101663062  108
YourSeq  68     632    724   3000   92.6%     chr1   +       63140465   63140573   109
YourSeq  64     628    724   3000   88.1%     chr14  -       99358103   99358241   139
YourSeq  64     628    724   3000   93.3%     chr12  +       80626621   80626750   130
YourSeq  63     632    724   3000   88.1%     chr12  +       55696821   55696955   135
YourSeq  61     632    724   3000   91.9%     chr13  +       23280397   23280515   119
YourSeq  61     632    723   3000   88.7%     chr10  +       67265791   67265923   133
YourSeq  57     632    723   3000   91.4%     chr11  +       13465381   13465512   132
YourSeq  56     638    716   3000   90.0%     chr19  +       48958117   48958236   120
YourSeq  51     637    724   3000   90.5%     chr10  +       68699516   68699644   129
YourSeq  50     668    724   3000   94.8%     chr10  -       42685440   42685498   59
YourSeq  50     690    810   3000   94.6%     chr3   +       40864233   40864606   374
YourSeq  48     634    698   3000   92.8%     chr9   -       114482991  114483055  65
YourSeq  48     632    723   3000   94.5%     chr1   +       159906980  159907112  133
YourSeq  46     634    722   3000   91.1%     chr7   -       113711363  113711491  129
YourSeq  46     667    723   3000   96.0%     chr3   +       88306136   88306366   231

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
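One way to summarize such a table is to separate the self hit from the secondary hits and look at the strongest secondary one. The sketch below uses the first three rows of the upstream table and assumes the first START/END pair refers to the query and the second to the genome, as in the usual BLAT layout. The field names and the function `parse_hits` are hypothetical, and the report does not state the cutoff it uses for "significant".

```python
FIELDS = ["query", "score", "q_start", "q_end", "q_size", "identity",
          "chrom", "strand", "t_start", "t_end", "span"]
INT_FIELDS = ("score", "q_start", "q_end", "q_size", "t_start", "t_end", "span")


def parse_hits(text: str):
    """Parse whitespace-separated BLAT result rows into dictionaries."""
    hits = []
    for line in text.strip().splitlines():
        hit = dict(zip(FIELDS, line.split()))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        hit["identity"] = float(hit["identity"].rstrip("%"))
        hits.append(hit)
    return hits


rows = """
YourSeq 3000 1 3000 3000 100.0% chr17 - 24886058 24889057 3000
YourSeq 176 554 765 3000 91.9% chr10 + 42841075 42841286 212
YourSeq 72 625 722 3000 89.2% chr2 + 126898998 126899121 124
"""

hits = parse_hits(rows)
self_hit = max(hits, key=lambda h: h["score"])          # the query's own locus
secondary = [h for h in hits if h is not self_hit]
best_secondary = max(secondary, key=lambda h: h["score"])
# Fraction of the 3000 bp query covered by the strongest secondary hit.
fraction = (best_secondary["q_end"] - best_secondary["q_start"] + 1) / best_secondary["q_size"]
print(best_secondary["chrom"], best_secondary["score"], round(fraction, 3))
```

Here the strongest secondary hit covers 212 of 3000 query bases, which gives a sense of scale for the note's conclusion.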

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  -       24880561   24883560   3000
YourSeq  38     138    201   3000   91.4%     chr18  +       75065204   75065268   65
YourSeq  33     191    494   3000   48.8%     chr4   -       27576495   27576537   43
YourSeq  32     176    214   3000   92.4%     chr15  +       103326956  103326995  40
YourSeq  31     189    229   3000   94.3%     chr11  +       94229054   94229097   44
YourSeq  30     189    222   3000   94.2%     chr17  -       24748549   24748582   34
YourSeq  29     196    224   3000   100.0%    chr3   -       104909613  104909641  29
YourSeq  28     193    222   3000   96.7%     chr3   -       28253697   28253726   30
YourSeq  27     189    217   3000   96.6%     chr16  -       5100590    5100618    29
YourSeq  26     189    216   3000   96.5%     chr15  -       79314830   79314857   28
YourSeq  26     191    220   3000   93.4%     chr5   +       140766496  140766525  30
YourSeq  24     189    216   3000   92.9%     chr13  +       13609500   13609527   28

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.


Gene and information:
Nubp2 nucleotide binding protein 2 [Mus musculus (house mouse)]
Gene ID: 26426, updated on 24-Oct-2019

Gene summary

Official Symbol: Nubp2 (provided by MGI)
Official Full Name: nucleotide binding protein 2 (provided by MGI)
Primary source: MGI:MGI:1347072
See related: Ensembl:ENSMUSG00000039183
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: D17Wsu11e
Expression: Ubiquitous expression in testis adult (RPKM 44.5), thymus adult (RPKM 22.0) and 28 other tissues
Orthologs: human

Genomic context

Location: 17 A3.3; 17 12.53 cM

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (24882611..24886427, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (25019556..25023295, complement)

[Figure: genomic context of Nubp2 on chromosome 17 (NC_000083.6).]


Transcript information: This gene has 15 transcripts

Gene: Nubp2 ENSMUSG00000039183

Description: nucleotide binding protein 2 [Source:MGI Symbol;Acc:MGI:1347072]
Gene Synonyms: D17Wsu11e
Location: Chromosome 17: 24,882,611-24,886,349 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 15 transcripts (splice variants), 191 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Nubp2-201  ENSMUST00000044252.6  1427  275aa       ENSMUSP00000049319.5  Protein coding           CCDS28500  Q9R061      TSL:1, GENCODE basic, APPRIS P1
Nubp2-214  ENSMUST00000234968.1  1502  245aa       ENSMUSP00000157175.1  Protein coding           -          A0A3Q4EGU9  GENCODE basic
Nubp2-213  ENSMUST00000234928.1  1471  211aa       ENSMUSP00000157378.1  Protein coding           -          A0A3Q4EH93  GENCODE basic
Nubp2-209  ENSMUST00000234597.1  1434  211aa       ENSMUSP00000157064.1  Protein coding           -          A0A3Q4EH93  GENCODE basic
Nubp2-210  ENSMUST00000234598.1  1168  189aa       ENSMUSP00000157118.1  Protein coding           -          A0A3Q4EGB1  CDS 5' incomplete
Nubp2-208  ENSMUST00000234583.1  899   251aa       ENSMUSP00000157034.1  Protein coding           -          A0A3Q4L2S1  GENCODE basic
Nubp2-207  ENSMUST00000234579.1  1445  61aa        ENSMUSP00000157158.1  Nonsense mediated decay  -          A0A3Q4L2Y0  -
Nubp2-206  ENSMUST00000234489.1  1423  133aa       ENSMUSP00000157348.1  Nonsense mediated decay  -          A0A3Q4EIH1  -
Nubp2-211  ENSMUST00000234674.1  1393  64aa        ENSMUSP00000157349.1  Nonsense mediated decay  -          A0A3Q4EGS2  -
Nubp2-205  ENSMUST00000234352.1  1224  50aa        ENSMUSP00000157112.1  Nonsense mediated decay  -          A0A3Q4EGA7  -
Nubp2-203  ENSMUST00000234261.1  1221  65aa        ENSMUSP00000157043.1  Nonsense mediated decay  -          A0A3Q4EBU9  -
Nubp2-204  ENSMUST00000234293.1  1191  46aa        ENSMUSP00000157234.1  Nonsense mediated decay  -          A0A3Q4L320  -
Nubp2-215  ENSMUST00000235055.1  950   58aa        ENSMUSP00000157236.1  Nonsense mediated decay  -          A0A3Q4EC31  -
Nubp2-212  ENSMUST00000234915.1  649   No protein  -                     Retained intron          -          -           -
Nubp2-202  ENSMUST00000234098.1  1440  No protein  -                     lncRNA                   -          -           -

[Figure: Ensembl genome browser view of a 23.74 kb region of chromosome 17 (24.875 Mb to 24.895 Mb) around Nubp2.
Forward strand genes (Comprehensive set): Igfals (201 to 206), Spsb3 (201 to 209), Mrps34 (201 to 203).
Reverse strand genes (Comprehensive set): Nubp2 (201 to 215), Eme2 (201 to 208), Mapk8ip3-216.
Contig: AC166102.2.
Tracks: Genes, Contigs, Regulatory Build.
Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000044252

[Figure: protein domain and variant view of Nubp2-201 (protein coding; reverse strand, 3.74 kb; protein ENSMUSP00000049...; scale bar 0 to 275).]

Domain annotations shown in the figure:
Superfamily: P-loop containing hydrolase
Flagellum site-determining protein YlxH/ Fe-S cluster assembling factor NBP35
PROSITE patterns: Mrp, conserved site
PANTHER: Mrp/NBP35 ATP-binding protein; PTHR23264:SF32
HAMAP: Mrp/NBP35 ATP-binding protein; Cytosolic Fe-S cluster assembly factor NUBP2/Cfd1,
Gene3D: 3.40.50.300
CDD: cd02037

Variant track: sequence variants (dbSNP and all other sources).
Variant legend: stop gained; missense variant; splice region variant; synonymous variant.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
