https://www.alphaknockout.com

Mouse Nckap1 Knockout Project (CRISPR/Cas9)

Objective: To create a Nckap1 knockout mouse model (C57BL/6N) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nckap1 gene (NCBI Reference Sequence: NM_001290745; Ensembl: ENSMUSG00000027002) is located on mouse chromosome 2. 32 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 32 (Transcript: ENSMUST00000111760). Exons 4~10 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for disruptions in this gene exhibit growth arrest at midgestation, an open neural tube, cardia bifida, defective foregut development, defects in endoderm and mesoderm migration, and sometimes duplication of the anteroposterior body axis.

Exon 4 starts at about 7.0% of the coding region. Exons 4~10 cover 21.4% of the coding region. The size of the effective KO region is ~9801 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', exons 1 to 32, with exons 4, 5, 6, 7, 8, 9 and 10 labeled and a gRNA region on each side of exons 4~10. Legend: exon of mouse Nckap1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence against itself. Legend: forward; reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
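The self-alignment described in this note can be sketched in a few lines. The example below is a minimal illustration, not the tool used for the report: the function name `dot_plot_matches` and the toy sequence are hypothetical, and only exact forward-strand matches of the 15 bp window are reported. A tandem repeat shows up as a run of matches at a constant offset from the main diagonal.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs with i < j whose window-length words are identical.

    Each pair is one off-diagonal dot in a self-versus-self dot plot
    (forward strand only, exact matches only).
    """
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy input (NOT the Nckap1 flank): a 20 bp unit repeated twice in tandem,
# followed by unrelated sequence.
unit = "ACGTTGCAAGCTTAGGCTCA"
toy = unit + unit + "TTTTGGGGCCCCAAAA"

hits = dot_plot_matches(toy, window=15)
offsets = {j - i for i, j in hits}
print(len(hits), offsets)
```

For a real 2000 bp flank, an empty or near-empty result corresponds to the "no significant tandem repeat" reading of the plot.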

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence against itself. Legend: forward; reverse complement. Axis label: Sequence 12.]

Note: The 2000 bp section downstream of Exon 10 is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(25.35% 507) | C(18.3% 366) | T(35.9% 718) | G(20.45% 409)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
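A sliding-window GC profile of the kind summarized here can be sketched as follows. This is an illustration under stated assumptions: the function names, the `step` parameter and the 65% cutoff are hypothetical choices, since the report gives only the 300 bp window size and does not state what it treats as "significantly high".

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, GC percent) for each window of the given size."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


def high_gc_starts(seq, window=300, step=1, threshold=65.0):
    """Start positions of windows whose GC percent exceeds the threshold.

    The 65% threshold is an assumed cutoff, not a value from the report.
    """
    return [s for s, pct in gc_windows(seq, window, step) if pct > threshold]


# Toy input (NOT the Nckap1 flank): an AT-only stretch, a 400 bp GC-only
# island, then another AT-only stretch.
toy = "AT" * 300 + "GC" * 200 + "AT" * 300

profile = gc_windows(toy, window=300, step=100)
flagged = high_gc_starts(toy, window=300, step=100)
print(flagged)
```

On a flank without GC-rich islands, `high_gc_starts` returns an empty list, which is the situation the note describes.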

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(28.75% 575) | C(19.05% 381) | T(30.25% 605) | G(21.95% 439)

Note: The 2000 bp section downstream of Exon 10 is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
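The overall GC content of each flank follows directly from the base counts in the two composition summaries. The short check below uses only those reported counts. The dictionary layout and the function name `gc_percent` are illustrative.

```python
# Base counts copied from the two "Summary: Full Length(2000bp)" lines.
flanks = {
    "upstream of Exon 4": {"A": 507, "C": 366, "T": 718, "G": 409},
    "downstream of Exon 10": {"A": 575, "C": 381, "T": 605, "G": 439},
}


def gc_percent(counts):
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


for name, counts in flanks.items():
    # Each flank should add up to the stated full length of 2000 bp.
    print(name, sum(counts.values()), round(gc_percent(counts), 2))
```

Both flanks sum to 2000 bp, with overall GC content of 38.75% upstream and 41.0% downstream.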


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr2   -       80554791   80556790   2000
YourSeq  121    159    454   2000   90.8%     chr6   +       128468979  128469276  298
YourSeq  113    216    447   2000   89.6%     chr4   -       135791764  135792233  470
YourSeq  98     222    452   2000   89.7%     chr19  +       43842567   43842810   244
YourSeq  87     335    447   2000   89.4%     chr3   +       133612515  133612625  111
YourSeq  84     335    447   2000   88.3%     chr4   -       134429524  134429634  111
YourSeq  84     216    424   2000   85.6%     chr11  +       115573518  115573771  254
YourSeq  82     159    447   2000   72.8%     chr4   -       29163209   29163391   183
YourSeq  82     335    447   2000   87.3%     chr17  -       13819002   13819112   111
YourSeq  82     335    452   2000   86.0%     chr10  -       70302448   70302563   116
YourSeq  81     335    454   2000   84.2%     chr17  -       30641546   30641662   117
YourSeq  81     336    452   2000   85.9%     chr12  -       43874364   43874478   115
YourSeq  81     335    447   2000   86.5%     chr17  +       38299060   38299170   111
YourSeq  81     159    447   2000   71.4%     chr11  +       97255842   97255989   148
YourSeq  80     335    447   2000   84.5%     chr10  -       66941701   66941810   110
YourSeq  80     337    446   2000   87.0%     chr8   +       72691573   72691680   108
YourSeq  79     335    447   2000   84.4%     chr14  -       59125054   59125163   110
YourSeq  79     339    454   2000   84.8%     chr5   +       104711455  104711568  114
YourSeq  79     218    395   2000   80.4%     chr4   +       125159929  125160092  164
YourSeq  79     345    452   2000   87.7%     chr19  +       32391919   32392024   106

Note: The 2000 bp section upstream of Exon 4 is searched against the genome with BLAT. Apart from the expected self-match on chr2, no significant similarity is found.
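Result rows in this layout can be read programmatically. The sketch below parses a few rows copied from the upstream table and separates the self-match from the secondary hits. The `BlatHit` type, its field names and the self-match rule (score equal to query size at 100% identity) are assumptions made for illustration. The report does not state the criterion it uses for "significant" similarity.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row):
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = row.split()
    # Some exports prefix each row with the link text "browser details".
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def is_self_hit(hit):
    """Assumed rule: full-length score at 100% identity is the query itself."""
    return hit.score == hit.q_size and hit.identity == 100.0


# Three rows copied from the upstream BLAT table.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr2 - 80554791 80556790 2000",
    "YourSeq 121 159 454 2000 90.8% chr6 + 128468979 128469276 298",
    "browser details YourSeq 113 216 447 2000 89.6% chr4 - 135791764 135792233 470",
]

hits = [parse_hit(r) for r in rows]
secondary = [h for h in hits if not is_self_hit(h)]
best_fraction = max(h.score / h.q_size for h in secondary)
print(len(secondary), best_fraction)
```

Here the best secondary hit scores 121 against a query size of 2000, far below the full-length self-match.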

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr2   -       80542990   80544989   2000
YourSeq  136    1746   2000  2000   88.3%     chr17  -       32902280   32902644   365
YourSeq  121    1267   1877  2000   79.9%     chr4   +       57366499   57367046   548
YourSeq  118    1752   1928  2000   84.0%     chr10  +       85068166   85068355   190
YourSeq  112    1746   1905  2000   85.1%     chr8   -       71102018   71102176   159
YourSeq  111    1746   1921  2000   87.8%     chr12  +       102896546  102896733  188
YourSeq  110    1752   2000  2000   83.9%     chr10  +       70062032   70062279   248
YourSeq  106    1746   1905  2000   83.5%     chr1   -       117820213  117820371  159
YourSeq  106    1746   1905  2000   81.8%     chr11  +       103426754  103426912  159
YourSeq  105    1747   1905  2000   83.4%     chr4   +       90275132   90275289   158
YourSeq  103    1746   1896  2000   82.7%     chr13  -       59660677   59660826   150
YourSeq  102    1732   1876  2000   86.5%     chr19  +       13075846   13076170   325
YourSeq  100    1509   1876  2000   75.6%     chr5   -       25823017   25823162   146
YourSeq  100    1763   1989  2000   87.5%     chr16  +       33604906   33605139   234
YourSeq  100    1746   1905  2000   82.0%     chr14  +       14483311   14483469   159
YourSeq  100    1738   1877  2000   84.2%     chr13  +       66951403   66951541   139
YourSeq  98     1760   1989  2000   80.0%     chr11  -       114098868  114099049  182
YourSeq  96     1746   1877  2000   83.9%     chr1   -       11991772   11991901   130
YourSeq  96     1746   1877  2000   84.8%     chr18  +       46838046   46838176   131
YourSeq  96     1745   1876  2000   84.8%     chr1   +       99520350   99520480   131

Note: The 2000 bp section downstream of Exon 10 is searched against the genome with BLAT. Apart from the expected self-match on chr2, no significant similarity is found.
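The ~9801 bp size of the effective KO region can be cross-checked against the self-match coordinates of the two flanks. The calculation below assumes that each 2000 bp flank lies immediately next to the KO region and that the chr2 coordinates are 1-based and inclusive. The reported spans of 2000 are consistent with inclusive coordinates. Because Nckap1 is on the reverse strand, the upstream flank has the higher genomic coordinates. Variable names are illustrative.

```python
# Self-match coordinates on chr2 (minus strand) from the two BLAT tables.
upstream_flank = (80554791, 80556790)    # 2000 bp upstream of Exon 4
downstream_flank = (80542990, 80544989)  # 2000 bp downstream of Exon 10

# Assumption: the KO region fills the gap between the two flanks exactly.
ko_start = downstream_flank[1] + 1
ko_end = upstream_flank[0] - 1
ko_size = ko_end - ko_start + 1

print(f"chr2:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under these assumptions the interval is chr2:80544990-80554790, which is 9801 bp and agrees with the stated KO region size.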


Gene and information:

Nckap1 NCK-associated protein 1 [Mus musculus (house mouse)]
Gene ID: 50884, updated on 12-Aug-2019

Gene summary

Official Symbol: Nckap1 (provided by MGI)
Official Full Name: NCK-associated protein 1 (provided by MGI)
Primary source: MGI:MGI:1355333
See related: Ensembl:ENSMUSG00000027002
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: H19; Hem2; Nap1; mh19; Hem-2; C79304; p125Nap1; mKIAA0587
Expression: Ubiquitous expression in cortex adult (RPKM 59.6), frontal lobe adult (RPKM 57.9) and 26 other tissues
Orthologs: human

Genomic context

Location: 2 C3; 2 48.21 cM
Exon count: 32

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (80500512..80581182, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (80341452..80421122, complement)

[Figure: genomic context of Nckap1 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 5 transcripts

Gene: Nckap1 ENSMUSG00000027002

Description: NCK-associated protein 1 [Source:MGI Symbol;Acc:MGI:1355333]
Gene Synonyms: H19, Hem-2, Hem2, Nap1, mh19
Location: 80,500,512-80,581,380 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 5 transcripts (splice variants), 210 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 24 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Nckap1-202  ENSMUST00000111760.2   4597  1134aa      ENSMUSP00000107390.2  Protein coding  CCDS71082  A2AS98   TSL:1 GENCODE basic APPRIS ALT1
Nckap1-201  ENSMUST00000028386.11  4469  1128aa      ENSMUSP00000028386.5  Protein coding  CCDS16177  P28660   TSL:1 GENCODE basic APPRIS P3
Nckap1-203  ENSMUST00000131872.7   885   No protein  -                     lncRNA          -          -        TSL:2
Nckap1-205  ENSMUST00000154793.1   687   No protein  -                     lncRNA          -          -        TSL:5
Nckap1-204  ENSMUST00000134587.1   498   No protein  -                     lncRNA          -          -        TSL:5

[Figure: Ensembl genome browser view of the Nckap1 locus, 100.87 kb, 80.50Mb to 80.58Mb, on contig AL928578.6. Forward strand: Gm13687-201 (processed pseudogene), Gm24461-201 (snRNA). Reverse strand: Nckap1-201 and Nckap1-202 (protein coding), Nckap1-203, Nckap1-204 and Nckap1-205 (lncRNA), Gm13689-201 (processed pseudogene). Regulatory Build track with legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; pseudogene).]


Transcript: ENSMUST00000111760

[Figure: protein feature view of Nckap1-202 (protein coding; reverse strand, 80.86 kb) for ENSMUSP00000107..., with tracks for MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Pfam (Nck-associated protein 1) and PANTHER (Nck-associated protein 1, PTHR12093:SF11), plus sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 1134.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
