https://www.alphaknockout.com

Mouse Npdc1 Knockout Project (CRISPR/Cas9)

Objective: To create an Npdc1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Npdc1 gene (NCBI Reference Sequence: NM_008721; Ensembl: ENSMUSG00000015094) is located on mouse chromosome 2. Nine exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 9 (Transcript: ENSMUST00000071442). Exons 1~9 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygous null mice display no obvious abnormalities in viability, fertility, behavior, or brain morphology.

Exon 1 starts from about 0.1% of the coding region. Exons 1~9 cover 100.0% of the coding region. The size of the effective KO region is ~5935 bp. The KO region contains no other known gene.


Overview of the Targeting Strategy

Figure: Wildtype allele, drawn 5' to 3', showing exons 1 to 9 of mouse Npdc1 with two gRNA regions marked. Legend: exon of mouse Npdc1; knockout region.


Overview of the Dot Plot (up)

Window size: 15 bp

Figure: Dot plot (axis label: Sequence 12). Legend: Forward; Reverse Complement.

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
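The self-alignment described in the note can be illustrated with a minimal Python sketch. It assumes exact matching of 15 bp words on the forward strand only; the plotting tool used for the report is not named, and the function names and the toy sequence are hypothetical. Each hit is a pair of positions whose 15 bp windows are identical, and many hits sharing the same offset form an off-diagonal line, which is how a tandem repeat appears in a dot plot.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def repeat_offsets(hits):
    """Count hits per diagonal offset (j - i)."""
    counts = {}
    for i, j in hits:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


# Toy sequence (not from the report): five copies of an 8 bp unit plus a tail.
toy = "ACGTTGCA" * 5 + "GATTACAGATTACCA"
hits = self_dot_plot(toy, window=15)
offsets = repeat_offsets(hits)
dominant_offset = max(offsets, key=offsets.get)
print(dominant_offset, offsets[dominant_offset])
```

On a real 2000 bp flank, an offset with many hits would point to a repeat of that period, and a gRNA site would then be chosen outside the positions covered by those hits.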

Overview of the Dot Plot (down)

Window size: 15 bp

Figure: Dot plot (axis label: Sequence 12). Legend: Forward; Reverse Complement.

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Figure: GC content distribution (axis label: Sequence 12).

Summary: Full Length(2000bp) | A(22.15% 443) | C(25.85% 517) | T(25.35% 507) | G(26.65% 533)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)

Window size: 300 bp

Figure: GC content distribution (axis label: Sequence 12).

Summary: Full Length(2000bp) | A(24.6% 492) | C(23.7% 474) | T(28.4% 568) | G(23.3% 466)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
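As a rough check on the two Summary lines, overall GC content follows directly from the reported base counts, and a 300 bp sliding window gives the kind of profile the figures show. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and the report does not state what threshold makes a high-GC region "significant", so none is assumed.

```python
def gc_percent_from_counts(counts):
    """GC percentage from a dict of base counts keyed by A, C, G, T."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window, returned as (start, percent) pairs."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Base counts copied from the Summary lines of the report.
upstream = {"A": 443, "C": 517, "T": 507, "G": 533}
downstream = {"A": 492, "C": 474, "T": 568, "G": 466}
up_gc = gc_percent_from_counts(upstream)
down_gc = gc_percent_from_counts(downstream)
print(f"upstream GC: {up_gc:.2f}%  downstream GC: {down_gc:.2f}%")

# Toy sequence (not from the report) to show the windowed profile.
toy_profile = sliding_gc("GC" * 10 + "AT" * 10, window=20, step=10)
```

The overall figures hide local variation, which is why the windowed profile, rather than the whole-flank average, is the relevant input when placing a gRNA or a PCR primer.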


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr2   +        25401255   25403254    2000
YourSeq    193    415    978   2000     91.5%  chr3   -        86198393   86515882  317490
YourSeq    191    410    976   2000     94.1%  chr1   +       182112720  182132543   19824
YourSeq    175    414    732   2000     89.9%  chr1   -        57153560   57153802     243
YourSeq    159    410    960   2000     82.9%  chrX   +       101702134  101702489     356
YourSeq    155    461    972   2000     84.8%  chr1   -        59760634   59760925     292
YourSeq    146    421    944   2000     82.5%  chr10  -        21117514   21117783     270
YourSeq    146    418    971   2000     80.2%  chr4   +        11424696   11425089     394
YourSeq    140    450   1114   2000     90.2%  chr14  +        25900858   26180805  279948
YourSeq    138    411    558   2000     97.3%  chr2   +       148489785  148489932     148
YourSeq    138    408    558   2000     94.6%  chr13  +        67588713   67588861     149
YourSeq    137    431    972   2000     82.5%  chr16  +        30401741   30402073     333
YourSeq    136    410    558   2000     93.9%  chr12  -        97190075   97190221     147
YourSeq    135    411    557   2000     94.6%  chr17  -        66011550   66011695     146
YourSeq    134    411    558   2000     95.3%  chr17  -        51154957   51155104     148
YourSeq    133    418    558   2000     97.2%  chr13  -         9288220    9288360     141
YourSeq    132    378    558   2000     90.9%  chr11  -        83296800   83296978     179
YourSeq    132    414    558   2000     95.9%  chr17  +        42853774   42853925     152
YourSeq    132    411    558   2000     94.6%  chr12  +       105699666  105699813     148
YourSeq    131    418    558   2000     95.7%  chr3   -       129607181  129607320     140

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
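To make the BLAT listing easier to inspect, the rows can be parsed into records and the self-hit separated from the remaining hits. The sketch below is a hypothetical helper, not part of the report's pipeline: it reads the last ten whitespace-separated fields of a row (so any leading link text is ignored) and uses only the first three rows of the table above as sample data. The report does not give the cutoff behind "no significant similarity", so no cutoff is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one BLAT result row using its last ten fields."""
    f = row.split()[-10:]
    return BlatHit(
        int(f[0]), int(f[1]), int(f[2]), int(f[3]),
        float(f[4].rstrip("%")), f[5], f[6],
        int(f[7]), int(f[8]), int(f[9]),
    )


# First three rows of the upstream table, copied from the report.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr2 + 25401255 25403254 2000",
    "YourSeq 193 415 978 2000 91.5% chr3 - 86198393 86515882 317490",
    "YourSeq 191 410 976 2000 94.1% chr1 + 182112720 182132543 19824",
]
hits = [parse_hit(r) for r in rows]
self_hit = max(hits, key=lambda h: h.score)
others = [h for h in hits if h is not self_hit]
best_other = max(others, key=lambda h: h.score)
print(best_other.chrom, best_other.score, best_other.q_end - best_other.q_start + 1)
```

In the upstream table the strongest non-self hit scores 193 against a self-hit of 2000, and most secondary hits fall within query positions of roughly 410 to 980.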

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
------------------------------------------------------------------------------------------
YourSeq   2000      1   2000   2000    100.0%  chr2   +        25409190   25411189    2000
YourSeq     89   1493   1602   2000     90.9%  chr2   -       132366884  132367032     149
YourSeq     89   1486   1689   2000     89.4%  chr14  +        59787786   59788098     313
YourSeq     86   1488   1620   2000     86.5%  chr8   +         6215512    6215669     158
YourSeq     85   1505   1624   2000     91.3%  chr1   +       152369823  152370344     522
YourSeq     83   1485   1591   2000     89.8%  chr8   -         7140852    7140966     115
YourSeq     82   1498   1597   2000     92.0%  chr4   +       141074509  141074649     141
YourSeq     81   1487   1591   2000     90.2%  chrX   +        89436923   89437032     110
YourSeq     81   1475   1591   2000     81.4%  chr3   +       101331996  101332098     103
YourSeq     81   1486   1587   2000     88.8%  chr12  +        65064192   65064292     101
YourSeq     81   1487   1591   2000     86.6%  chr11  +         5161449    5161552     104
YourSeq     80   1486   1587   2000     90.0%  chr2   -        77466834   77466940     107
YourSeq     80   1498   1596   2000     91.0%  chr17  -        84495202   84495340     139
YourSeq     80   1008   1591   2000     74.5%  chr14  +        30991366   30991518     153
YourSeq     79   1486   1591   2000     87.7%  chr13  +        35933082   35933189     108
YourSeq     78   1486   1591   2000     84.7%  chr4   -        93846578   93846679     102
YourSeq     78   1074   1591   2000     71.8%  chr16  -        21973323   21973429     107
YourSeq     78   1488   1587   2000     89.8%  chr2   +        33514055   33514156     102
YourSeq     78   1498   1591   2000     89.3%  chr13  +       106541836  106541928      93
YourSeq     77   1503   1596   2000     92.4%  chr7   +        64051856   64052135     280

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.
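The self-hits of the two BLAT searches also give the genomic positions of the flanking windows, and the interval between them can be compared with the stated KO region size of ~5935 bp. The short sketch below assumes 1-based inclusive coordinates and assumes that the KO region is exactly the interval lying strictly between the two 2000 bp flanks; neither assumption is stated in the report, and the variable names are illustrative.

```python
# chr2 coordinates of the 100% self-hits, copied from the BLAT tables.
UP_FLANK = (25401255, 25403254)     # 2000 bp upstream of the start codon
DOWN_FLANK = (25409190, 25411189)   # 2000 bp downstream of the stop codon


def interval_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


# Assumed KO interval: everything strictly between the two flanks.
ko_start = UP_FLANK[1] + 1
ko_end = DOWN_FLANK[0] - 1
ko_size = interval_length(ko_start, ko_end)

print(f"chr2:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under these assumptions the interval is 5935 bp long, which agrees with the effective KO region size given in the strategy summary.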


Gene and information: Npdc1 neural proliferation, differentiation and control 1 [ Mus musculus (house mouse) ] Gene ID: 18146, updated on 12-Aug-2019

Gene summary

Official Symbol: Npdc1 (provided by MGI)
Official Full Name: neural proliferation, differentiation and control 1 (provided by MGI)
Primary source: MGI:MGI:1099802
See related: Ensembl:ENSMUSG00000015094
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: NPDC-1; AI314472
Expression: Broad expression in adrenal adult (RPKM 116.4), whole brain E14.5 (RPKM 82.6) and 21 other tissues
Orthologs: human; all

Genomic context

Location: 2; 2 A3
Exon count: 9

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (25403050..25409494)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (25258603..25265014)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 14 transcripts

Gene: Npdc1 ENSMUSG00000015094

Description: neural proliferation, differentiation and control 1 [Source:MGI Symbol;Acc:MGI:1099802]
Gene Synonyms: NPDC-1
Location: Chromosome 2: 25,399,351-25,409,494 forward strand. GRCm38:CM000995.2
About this gene: This gene has 14 transcripts (splice variants), 221 orthologues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Npdc1-202  ENSMUST00000071442.11  1485  332aa       ENSMUSP00000071387.5  Protein coding  CCDS15771  Q64322   TSL:1 GENCODE basic APPRIS P2
Npdc1-201  ENSMUST00000055921.13  1412  341aa       ENSMUSP00000049602.6  Protein coding  -          A2AJ21   TSL:5 GENCODE basic APPRIS ALT2
Npdc1-207  ENSMUST00000133409.1   1128  274aa       ENSMUSP00000117773.1  Protein coding  -          A2AJ23   CDS 5' incomplete TSL:3
Npdc1-211  ENSMUST00000141567.7   754   231aa       ENSMUSP00000116275.1  Protein coding  -          A2AJ20   CDS 3' incomplete TSL:3
Npdc1-213  ENSMUST00000154809.7   510   141aa       ENSMUSP00000123386.1  Protein coding  -          A2AJ19   CDS 3' incomplete TSL:3
Npdc1-208  ENSMUST00000136138.7   1510  No protein  -                     lncRNA          -          -        TSL:5
Npdc1-204  ENSMUST00000128144.1   1159  No protein  -                     lncRNA          -          -        TSL:5
Npdc1-205  ENSMUST00000131185.7   1119  No protein  -                     lncRNA          -          -        TSL:2
Npdc1-212  ENSMUST00000144413.1   893   No protein  -                     lncRNA          -          -        TSL:2
Npdc1-206  ENSMUST00000132287.7   757   No protein  -                     lncRNA          -          -        TSL:5
Npdc1-203  ENSMUST00000124277.1   738   No protein  -                     lncRNA          -          -        TSL:3
Npdc1-214  ENSMUST00000156824.1   645   No protein  -                     lncRNA          -          -        TSL:5
Npdc1-210  ENSMUST00000141106.7   627   No protein  -                     lncRNA          -          -        TSL:3
Npdc1-209  ENSMUST00000138651.1   497   No protein  -                     lncRNA          -          -        TSL:3


Figure: Ensembl genome browser view of a 30.14 kb region of chromosome 2 (25.39 Mb to 25.41 Mb), forward strand, comprehensive gene set. Tracks shown: Npdc1 transcripts (protein coding: Npdc1-201, -202, -207, -211, -213; lncRNA: Npdc1-203, -204, -205, -206, -208, -209, -210, -212, -214), neighboring Entpd2 transcripts (Entpd2-201, protein coding; Entpd2-202 and Entpd2-203, lncRNA), contig AL732557.4, and the Regulatory Build.

Regulation Legend: CTCF; Open Chromatin; Promoter; Promoter Flank
Gene Legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene)


Transcript: ENSMUST00000071442

Figure: Ensembl view of transcript Npdc1-202 (protein coding; 6.42 kb, forward strand) and its protein product (ENSMUSP00000071...), drawn on a scale of 0 to 332 amino acids. Tracks shown: Transmembrane heli...; MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Cleavage site (Sign...; Pfam: Neural proliferation differentiation control-1; PANTHER: Neural proliferation differentiation control-1; Sequence variants (dbSNP and all other sources).

Variant Legend: missense variant; splice region variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
