https://www.alphaknockout.com

Mouse Ndfip1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Ndfip1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ndfip1 gene (NCBI Reference Sequence: NM_022996; Ensembl: ENSMUSG00000024425) is located on mouse chromosome 18. Eight exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 7 (Transcript: ENSMUST00000236085). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Ndfip1 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-388B21 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene-trapped allele develop severe inflammation of the skin and lung due to T-cell hyperactivation and abnormal T-helper 2 physiology, and die prematurely. Mice homozygous for a null allele exhibit hypersensitivity of dopaminergic neurons to iron toxicity.

Exon 3 starts at about 22.93% of the way into the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 4686 bp, and the size of intron 3 for 3'-loxP site insertion is 3733 bp. The size of the effective cKO region is ~631 bp. The cKO region does not contain any other known gene.
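As a rough cross-check of the figures above, the following sketch converts the 22.93% offset into a nucleotide position and illustrates the divisibility rule behind the frameshift statement. It assumes the 221 aa protein length listed for ENSMUST00000236085 and a coding sequence of 663 nt (stop codon excluded). The function names and the example deletion lengths are hypothetical, since this report does not state the coding length of exon 3.

```python
def cds_offset(fraction, protein_len_aa):
    """Approximate nucleotide offset into the CDS for a fractional position.

    Assumes the CDS length is protein length x 3 (stop codon excluded).
    """
    cds_len = protein_len_aa * 3
    return round(fraction * cds_len)


def deletion_shifts_frame(deleted_coding_nt):
    """A deletion shifts the reading frame unless its coding length is a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 22.93% of the coding region, 221 aa protein (values taken from this report)
exon3_offset = cds_offset(0.2293, 221)
print(exon3_offset)

# Hypothetical coding lengths, for illustration only
print(deletion_shifts_frame(100))  # not a multiple of 3
print(deletion_shifts_frame(99))   # multiple of 3
```

Under these assumptions exon 3 would begin roughly 152 nt into the coding sequence, so a frameshift at that point would affect most of the downstream protein.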

Overview of the Targeting Strategy

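[Figure: schematic of the targeting strategy, showing the wildtype allele (exon labels 1, 3 and 8, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Ndfip1, homology arm, cKO region, loxP site.]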

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and the cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
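The self-alignment described in the note can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it reports only exact forward matches between windows (no reverse-complement matches, no mismatch tolerance), and the function name and toy sequences are hypothetical.

```python
from collections import defaultdict


def self_dot_plot(seq, window=10):
    """Return sorted off-diagonal (i, j) pairs, i < j, whose windows are identical.

    Each pair is one dot away from the main diagonal of a self dot plot.
    Runs of dots parallel to the diagonal indicate repeats.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy sequences for illustration only (not taken from the Ndfip1 locus)
repeated = "ACGTTGCAAG" * 2          # a 10-mer repeated in tandem
unique = "ACGTTGCAAGTCCATGAATC"      # no repeated 10-mer

print(self_dot_plot(repeated))
print(self_dot_plot(unique))
```

An empty result corresponds to a dot plot with nothing but the main diagonal, which is the situation the note describes for this region.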

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7131 bp) | A (29.04%, 2071) | C (18.13%, 1293) | T (30.87%, 2201) | G (21.96%, 1566)

Note: The sequence of the homologous arms and the cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
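The base counts in the summary can be checked, and the overall GC content derived, with a few lines of arithmetic. The sketch below uses only the counts reported above; the overall GC percentage is computed here and is not a figure stated in the report.

```python
# Base counts copied from the summary line of this report
counts = {"A": 2071, "C": 1293, "T": 2201, "G": 1566}

total = sum(counts.values())

# Per-base percentages, rounded to two decimals as in the summary
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

# Overall GC content across the full-length sequence
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print(total)
print(percent)
print(gc_percent)
```

The counts sum to the stated full length of 7131 bp, and the overall GC content comes out at roughly 40%.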

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr18  +       38444442   38447441   3000
YourSeq  152    2170   2351  3000   94.3%     chrX   +       102202330  102202517  188
YourSeq  150    2178   2351  3000   93.2%     chr12  +       25088826   25088999   174
YourSeq  149    2122   2352  3000   90.3%     chr11  -       85859000   85859294   295
YourSeq  147    1998   2325  3000   84.1%     chr7   +       44940031   44940318   288
YourSeq  146    2177   2349  3000   92.5%     chr17  -       62094723   62094898   176
YourSeq  144    2177   2349  3000   92.0%     chr15  -       73355111   73355286   176
YourSeq  144    2181   2349  3000   92.9%     chr11  -       84461838   84462009   172
YourSeq  144    2178   2349  3000   92.4%     chr16  +       96096316   96096490   175
YourSeq  143    2170   2352  3000   89.9%     chr2   -       38903816   38904003   188
YourSeq  143    2168   2565  3000   92.4%     chr18  -       73879734   73880142   409
YourSeq  143    2178   2350  3000   92.0%     chr12  -       72742162   72742338   177
YourSeq  143    2170   2350  3000   89.6%     chr5   +       123071915  123072095  181
YourSeq  143    2013   2326  3000   84.6%     chr18  +       10550954   10551190   237
YourSeq  143    2178   2350  3000   93.4%     chr14  +       59415005   59415575   571
YourSeq  142    2170   2352  3000   89.6%     chr13  -       40625806   40625992   187
YourSeq  140    1998   2325  3000   83.6%     chr11  -       3207847    3208103    257
YourSeq  140    2177   2349  3000   90.7%     chr10  -       39179398   39179573   176
YourSeq  140    2178   2350  3000   90.8%     chr3   +       130210679  130211020  342
YourSeq  139    2170   2325  3000   95.5%     chr7   -       141073007  141073311  305

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr18  +       38448073   38451072   3000
YourSeq  233    2387   2780  3000   87.0%     chr13  -       30056213   30056643   431
YourSeq  226    2403   2763  3000   86.2%     chr5   -       132634360  132634716  357
YourSeq  224    2387   2720  3000   87.7%     chr3   -       135750346  135750707  362
YourSeq  213    2394   2712  3000   86.5%     chrX   -       12408352   12408649   298
YourSeq  212    2408   2773  3000   87.6%     chr13  -       59750987   59751354   368
YourSeq  208    2415   2774  3000   87.9%     chr3   +       87029855   87030524   670
YourSeq  201    2394   2775  3000   81.9%     chr1   -       190732125  190732500  376
YourSeq  199    2393   2720  3000   84.6%     chr1   -       136566594  136566886  293
YourSeq  199    2394   2775  3000   90.3%     chr4   +       106286907  106287299  393
YourSeq  196    2403   2690  3000   87.4%     chr2   -       147173689  147173994  306
YourSeq  193    2394   2686  3000   88.2%     chr17  -       87701073   88092466   391394
YourSeq  193    2394   2766  3000   90.2%     chr6   +       34286318   34286710   393
YourSeq  193    2403   2715  3000   86.0%     chr10  +       59098497   59098801   305
YourSeq  192    2406   2715  3000   86.3%     chr4   +       128854607  128854896  290
YourSeq  191    2389   2773  3000   83.4%     chr16  -       56183498   56183848   351
YourSeq  191    2406   2762  3000   86.9%     chr1   +       24635268   24635662   395
YourSeq  190    2406   2711  3000   87.8%     chr11  -       109626476  109841040  214565
YourSeq  190    2389   2694  3000   85.3%     chr10  -       63364542   63364833   292
YourSeq  190    2408   2721  3000   88.6%     chr2   +       118338656  118338976  321

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
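The BLAT results above can be read in two ways, sketched below. First, the interval between the two self-hits on chr18 can be computed and compared with the stated cKO region size; that this gap corresponds to the effective cKO region is an inference from the matching size, not something the report states explicitly. Second, the best non-self score can be expressed as a fraction of the query size to see how weak the secondary hits are. The function names are hypothetical.

```python
# Self-hits of the two 3000 bp flanks, copied from the BLAT results above:
# (chromosome, target start, target end)
UP_FLANK = ("chr18", 38444442, 38447441)
DOWN_FLANK = ("chr18", 38448073, 38451072)


def gap_between(up, down):
    """Return (start, end, length) of the interval strictly between two flanks."""
    if up[0] != down[0]:
        raise ValueError("flanks must be on the same chromosome")
    start = up[2] + 1
    end = down[1] - 1
    return start, end, end - start + 1


def best_offtarget_fraction(scores, qsize):
    """Best non-self BLAT score as a fraction of the query size.

    Assumes scores are sorted in descending order and scores[0] is the self-hit.
    """
    return scores[1] / qsize


cko_start, cko_end, cko_len = gap_between(UP_FLANK, DOWN_FLANK)
print(cko_start, cko_end, cko_len)

# Top two scores from each table (self-hit, best secondary hit)
print(best_offtarget_fraction([3000, 152], 3000))  # upstream flank
print(best_offtarget_fraction([3000, 233], 3000))  # downstream flank
```

The gap between the flanks is 631 bp, matching the ~631 bp effective cKO region, and the best secondary hits score below 8% of the query size for both flanks.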

Gene and information: Ndfip1

Nedd4 family interacting protein 1 [Mus musculus (house mouse)]
Gene ID: 65113, updated on 24-Oct-2019

Gene summary

Official Symbol: Ndfip1 (provided by MGI)
Official Full Name: Nedd4 family interacting protein 1 (provided by MGI)
Primary source: MGI:MGI:1929601
See related: Ensembl:ENSMUSG00000024425
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: N4wbp5; 0610010M22Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 223.7), CNS E18 (RPKM 122.4) and 28 other tissues

Genomic context

Location: 18; 18 B3

Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (38418907..38465197)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (38578629..38624060)

[Figure: genomic context, Chromosome 18 - NC_000084.6]

Transcript information: This gene has 8 transcripts

Gene: Ndfip1 ENSMUSG00000024425

Description: Nedd4 family interacting protein 1 [Source:MGI Symbol;Acc:MGI:1929601]
Gene Synonyms: 0610010M22Rik
Location: Chromosome 18: 38,410,396-38,465,303 forward strand. GRCm38:CM001011.2
About this gene: This gene has 8 transcripts (splice variants), 250 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 21 phenotypes.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Ndfip1-204  ENSMUST00000236085.1  1796  221aa       ENSMUSP00000158314.1  Protein coding           CCDS37789  Q8R0W6      GENCODE basic, APPRIS P2
Ndfip1-201  ENSMUST00000025293.4  1392  221aa       ENSMUSP00000025293.3  Protein coding           CCDS37789  Q8R0W6      TSL:1, GENCODE basic, APPRIS P2
Ndfip1-203  ENSMUST00000236052.1  1723  214aa       ENSMUSP00000157954.1  Protein coding           -          A0A494BAA4  GENCODE basic, APPRIS ALT2
Ndfip1-205  ENSMUST00000236171.1  1237  164aa       ENSMUSP00000157630.1  Protein coding           -          A0A494B9F5  CDS 5' incomplete
Ndfip1-206  ENSMUST00000236480.1  838   167aa       ENSMUSP00000158119.1  Nonsense mediated decay  -          A0A494BAK0  -
Ndfip1-208  ENSMUST00000238135.1  4364  No protein  -                     Retained intron          -          -           -
Ndfip1-202  ENSMUST00000235458.1  817   No protein  -                     Retained intron          -          -           -
Ndfip1-207  ENSMUST00000236803.1  729   No protein  -                     Retained intron          -          -           -

[Figure: Ensembl gene view, 74.91 kb, forward and reverse strands. Genes (Comprehensive set) track: Ndfip1-201, Ndfip1-203, Ndfip1-204 and Ndfip1-205 (protein coding); Ndfip1-206 (nonsense mediated decay); Ndfip1-202, Ndfip1-207 and Ndfip1-208 (retained intron); Gm50344-201 (processed pseudogene). Contigs track: AC134576.3. Regulatory Build track with legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (pseudogene, processed transcript).]

Transcript: ENSMUST00000236085

[Figure: Ensembl view of Ndfip1-204 (protein coding), 46.29 kb, forward strand, with protein features of ENSMUSP00000158...: Transmembrane heli..., MobiDB lite, Low complexity (Seg), Cleavage site (Sign..., Pfam NEDD4/Bsd2, PANTHER NEDD4/Bsd2 (PTHR13396:SF3). Variant track: sequence variants (dbSNP and all other sources), with legend for missense and synonymous variants. Scale bar: 0 to 221.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
