https://www.alphaknockout.com

Mouse Ncapd2 Knockout Project (CRISPR/Cas9)

Objective: To create an Ncapd2 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ncapd2 gene (NCBI Reference Sequence: NM_146171; Ensembl: ENSMUSG00000038252) is located on mouse chromosome 6. 32 exons have been identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 32 (Transcript: ENSMUST00000043848). Exons 3~13 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 3 starts at about 3.07% of the coding region. Exons 3~13 cover 34.87% of the coding region. The size of the effective KO region is ~7915 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Ncapd2, drawn 5' to 3', with labeled exons 1, 3 to 13 and 32. A gRNA region lies on each side of the knockout region (exons 3~13). Legend: exon of mouse Ncapd2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot of the upstream section. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 3 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
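The self-alignment described in the note can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: the function names `reverse_complement` and `self_dot_plot` are hypothetical, and exact matching of 15 bp words is an assumption (the report does not state whether mismatches are tolerated).

```python
from collections import defaultdict

# Translation table for complementing DNA bases (uppercase A/C/G/T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact word matches.

    Returns two lists of (i, j) window start positions (0-based):
    forward matches (excluding the trivial i == j diagonal) and
    reverse-complement matches. Dense off-diagonal forward matches
    would suggest tandem or dispersed repeats.
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        for j in index.get(word, []):
            if j != i:
                forward.append((i, j))
        for j in index.get(reverse_complement(word), []):
            reverse.append((i, j))
    return forward, reverse
```

An empty or very sparse `forward` list for a flanking sequence would be consistent with the "no significant tandem repeat" conclusion above.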

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot of the downstream section. Legend: Forward; Reverse Complement.]

Note: The 1313 bp section downstream of Exon 13 is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream section.]

Summary: Full Length(2000bp) | A(26.35% 527) | C(20.8% 416) | T(28.05% 561) | G(24.8% 496)

Note: The 2000 bp section upstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
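A windowed GC profile of the kind plotted above can be computed along these lines. This is only a sketch: `gc_windows` is a hypothetical helper, and the step size of 1 bp is an assumption, since the report states only the 300 bp window size.

```python
def gc_windows(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_percent) for each window along the sequence.

    start is the 0-based offset of the window; gc_percent is the share
    of G and C bases in that window, from 0.0 to 100.0.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc_count / window))
    return profile
```

Windows with a markedly high GC percentage would be the ones to avoid when placing screening primers.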

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream section.]

Summary: Full Length(1313bp) | A(23.53% 309) | C(20.72% 272) | T(29.78% 391) | G(25.97% 341)

Note: The 1313 bp section downstream of Exon 13 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
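The overall GC content of each flank follows directly from the base counts in the two Summary lines. The short check below uses only those counts; `gc_percent` is a hypothetical helper name.

```python
def gc_percent(a: int, c: int, t: int, g: int) -> float:
    """Overall GC percentage from per-base counts, rounded to 2 decimals."""
    total = a + c + t + g
    return round(100.0 * (c + g) / total, 2)


# Base counts copied from the Summary lines for each flank.
upstream_gc = gc_percent(a=527, c=416, t=561, g=496)    # 2000 bp section
downstream_gc = gc_percent(a=309, c=272, t=391, g=341)  # 1313 bp section

print(upstream_gc, downstream_gc)
```

Both values come out between 45% and 47%, which agrees with the statement that neither flank is GC-rich overall.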


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000      1  2000   2000    100.0%  chr6   -       125187153  125189152   2000
YourSeq    124    744  1029   2000     93.2%  chr2   -       164451137  164477618  26482
YourSeq    118    689   855   2000     85.5%  chr4   +        36339167   36339323    157
YourSeq    115    701   858   2000     90.8%  chr12  +        78934651   78934809    159
YourSeq    114    706   878   2000     90.8%  chr4   +        69124533   69124781    249
YourSeq    114    701   851   2000     88.6%  chr15  +        97575872   97576038    167
YourSeq    110    700   851   2000     92.4%  chr19  -        21298269   21298424    156
YourSeq    109    744  1019   2000     91.7%  chr15  -        81503051   81503437    387
YourSeq    107    672   851   2000     87.5%  chr5   -       140094598  140094773    176
YourSeq    107    700   851   2000     92.9%  chr12  +        70284757   70284908    152
YourSeq    105    712   851   2000     86.5%  chrX   -        41903877   41904014    138
YourSeq    105    698   852   2000     91.5%  chr1   -        53817186   53817343    158
YourSeq    105    711   855   2000     91.5%  chr18  +        34707953   34708098    146
YourSeq    103    706   851   2000     92.5%  chr5   -        23637682   23637826    145
YourSeq    103    713   851   2000     91.2%  chr15  +        99708645   99708784    140
YourSeq    103    689   851   2000     92.6%  chr14  +       103124818  103124989    172
YourSeq    100    706   851   2000     91.0%  chr1   +        82780553   82780699    147
YourSeq     99    706   857   2000     90.3%  chr19  +        14698312   14698463    152
YourSeq     99    691   849   2000     89.7%  chr1   +       152465162  152465321    160
YourSeq     98    744   878   2000     86.3%  chrX   -       134679640  134679765    126

Note: The 2000 bp section upstream of Exon 3 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr6, no significant similarity is found.
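One way to read the table is to ask how much of the query each secondary hit covers. The sketch below does this for the first three rows of the upstream table. The `BlatHit` type, the helper names and the 50% coverage threshold are illustrative assumptions, not criteria stated in the report.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    """One row of BLAT output, using the table's column order."""
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int


def query_coverage(hit: BlatHit) -> float:
    """Percentage of the query spanned by the hit (inclusive coordinates)."""
    return round(100.0 * (hit.q_end - hit.q_start + 1) / hit.q_size, 2)


def secondary_hits(hits: List[BlatHit], min_coverage: float = 50.0) -> List[BlatHit]:
    """Hits other than the top-scoring one that span at least min_coverage
    percent of the query. The 50% default is an arbitrary illustration."""
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits if h is not best and query_coverage(h) >= min_coverage]


# First three rows of the upstream table.
hits = [
    BlatHit(2000, 1, 2000, 2000, 100.0, "chr6", "-", 125187153, 125189152),
    BlatHit(124, 744, 1029, 2000, 93.2, "chr2", "-", 164451137, 164477618),
    BlatHit(118, 689, 855, 2000, 85.5, "chr4", "+", 36339167, 36339323),
]
```

With these rows, the best secondary hit spans only a small fraction of the 2000 bp query, which is in line with the note's conclusion.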

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END   SPAN
-----------------------------------------------------------------------------------------
YourSeq   1313      1  1313   1313    100.0%  chr6   -       125177925  125179237   1313
YourSeq     47    975  1202   1313     91.1%  chr3   -        86032266   86032521    256
YourSeq     42    975  1020   1313     97.8%  chr10  -       116240998  116241048     51
YourSeq     37    732   796   1313     78.5%  chr12  -        85465802   85465866     65
YourSeq     37    976  1083   1313     95.3%  chr11  +        69808651   69808760    110
YourSeq     35    973  1022   1313     92.7%  chr2   -        11825275   11825331     57
YourSeq     34    364   417   1313     94.8%  chr11  -       106325304  106325376     73
YourSeq     34    369   417   1313     97.3%  chr1   -        77640867   77640916     50
YourSeq     34    366   417   1313     92.5%  chr5   +       135882179  135882232     54
YourSeq     30    368   415   1313     90.7%  chr8   -       121695525  121695571     47
YourSeq     30    351   387   1313     90.7%  chr6   +       136185255  136185290     36
YourSeq     30    348   390   1313     96.9%  chr4   +        70036825   70036869     45
YourSeq     30    370   417   1313     96.9%  chr3   +       129087072  129087121     50
YourSeq     29    372   411   1313     96.8%  chr3   -        20049545   20049586     42
YourSeq     29    376   425   1313     91.5%  chr5   +        63783364   63783415     52
YourSeq     28    379   421   1313     85.3%  chr11  +        66540861   66540902     42
YourSeq     27    974  1019   1313     96.6%  chr17  -        31588924   31588971     48
YourSeq     27    975  1009   1313     96.6%  chr1   -       178469891  178469926     36
YourSeq     27   1046  1076   1313     93.6%  chrX   +        36486706   36486736     31
YourSeq     27    948   988   1313     75.9%  chr8   +        69804928   69804961     34

Note: The 1313 bp section downstream of Exon 13 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr6, no significant similarity is found.
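The top hits of the two BLAT searches give the chr6 coordinates of both flanking sections, so the distance between them can be checked against the ~7915 bp KO region quoted in the strategy summary. The sketch below assumes that the displayed coordinates are 1-based and inclusive and that the two flanks directly abut the KO region. The report does not state either point explicitly, and `gap_between` is a hypothetical helper.

```python
def interval_length(interval):
    """Length of a 1-based inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper):
    """Number of bases strictly between two non-overlapping intervals."""
    if upper[0] <= lower[1]:
        raise ValueError("intervals overlap or are in the wrong order")
    return upper[0] - lower[1] - 1


# chr6 coordinates of the 100% self-hits. The gene is on the reverse strand,
# so the section upstream of Exon 3 has the higher genomic coordinates.
upstream_flank = (125187153, 125189152)
downstream_flank = (125177925, 125179237)

ko_region_bp = gap_between(downstream_flank, upstream_flank)
print(ko_region_bp)
```

Under those assumptions, the gap between the flanks matches the KO region size given in the strategy summary.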


Gene information: Ncapd2 non-SMC condensin I complex, subunit D2 [ Mus musculus (house mouse) ] Gene ID: 68298, updated on 12-Aug-2019

Gene summary

Official Symbol: Ncapd2 (provided by MGI)
Official Full Name: non-SMC condensin I complex, subunit D2 (provided by MGI)
Primary source: MGI:MGI:1915548
See related: Ensembl:ENSMUSG00000038252
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: CNAP1; CAP-D2; mKIAA0159; 2810406C15Rik; 2810465G24Rik
Expression: Broad expression in CNS E11.5 (RPKM 53.3), thymus adult (RPKM 40.7) and 24 other tissues
Orthologs: human, all

Genomic context

Location: 6; 6 F2
Exon count: 33

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 6 | NC_000072.6 (125168007..125192100, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 6 | NC_000072.5 (125118025..125141604, complement)

Chromosome 6 - NC_000072.6


Transcript information: This gene has 13 transcripts

Gene: Ncapd2 ENSMUSG00000038252

Description: non-SMC condensin I complex, subunit D2 [Source:MGI Symbol;Acc:MGI:1915548]
Gene Synonyms: 2810406C15Rik, 2810465G24Rik, CAP-D2, CNAP1
Location: Chromosome 6: 125,168,007-125,191,701 reverse strand. GRCm38:CM000999.2
About this gene: This gene has 13 transcripts (splice variants), 198 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Ncapd2-201 | ENSMUST00000043848.10 | 4629 | 1392aa | ENSMUSP00000042260.4 | Protein coding | CCDS39636 | A0A0R4J0H7 | TSL:1, GENCODE basic, APPRIS P1
Ncapd2-212 | ENSMUST00000189959.1 | 604 | 162aa | ENSMUSP00000139445.1 | Protein coding | - | A0A087WNQ1 | CDS 3' incomplete, TSL:3
Ncapd2-210 | ENSMUST00000188762.6 | 877 | 94aa | ENSMUSP00000140672.1 | Nonsense mediated decay | - | A0A087WRK6 | TSL:5
Ncapd2-206 | ENSMUST00000188119.1 | 894 | No protein | - | Retained intron | - | - | TSL:2
Ncapd2-202 | ENSMUST00000185624.6 | 825 | No protein | - | Retained intron | - | - | TSL:3
Ncapd2-205 | ENSMUST00000186667.6 | 745 | No protein | - | Retained intron | - | - | TSL:3
Ncapd2-211 | ENSMUST00000189706.6 | 688 | No protein | - | Retained intron | - | - | TSL:5
Ncapd2-203 | ENSMUST00000186210.1 | 682 | No protein | - | Retained intron | - | - | TSL:3
Ncapd2-213 | ENSMUST00000191080.1 | 670 | No protein | - | Retained intron | - | - | TSL:2
Ncapd2-209 | ENSMUST00000188665.1 | 646 | No protein | - | Retained intron | - | - | TSL:2
Ncapd2-207 | ENSMUST00000188306.1 | 630 | No protein | - | Retained intron | - | - | TSL:2
Ncapd2-204 | ENSMUST00000186561.1 | 617 | No protein | - | Retained intron | - | - | TSL:2
Ncapd2-208 | ENSMUST00000188410.6 | 480 | No protein | - | Retained intron | - | - | TSL:2


[Figure: Ensembl genomic context of Ncapd2 on chromosome 6, 125.16Mb to 125.20Mb (43.70 kb), contig AC166162.6.
Forward strand: Iffo1 (transcripts 201 to 207 and 209), Mrpl51 (201 to 203), Mir3098-201.
Reverse strand: Gapdh (201 to 213), Ncapd2 (202, 203, 205, 207, 208, 209, 211, 212, 213), Scarna10-201, Gm28967-201.
Tracks: Contigs; Genes (Comprehensive set); Regulatory Build.
Regulation legend: CTCF; Enhancer; Open; Promoter; Promoter Flank.
Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000043848

[Figure: protein domain view of Ncapd2-201 (protein coding, reverse strand, 23.70 kb), protein ENSMUSP00000042..., scale bar 0 to 1392.
Annotation tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Superfamily: Armadillo-type fold; Pfam: Condensin complex subunit 1, N-terminal and Condensin complex subunit 1, C-terminal; PIRSF: Condensin subunit 1; PANTHER: Condensin subunit 1/Condensin-2 complex subunit D3 and Condensin subunit 1; Gene3D: Armadillo-like helical; Sequence variants (dbSNP and all other sources).
Variant legend: stop gained; missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
