https://www.alphaknockout.com

Mouse Net1 Knockout Project (CRISPR/Cas9)

Objective: To create a Net1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Net1 gene (NCBI Reference Sequence: NM_019671; Ensembl: ENSMUSG00000021215) is located on mouse chromosome 13. Twelve exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 12 (Transcript: ENSMUST00000091853). Exons 4~11 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit delayed mammary gland development during puberty, associated with slower ductal extension, reduced ductal branching and epithelial cell proliferation, disorganized myoepithelial and ductal epithelial cells, and increased collagen deposition.

Exon 4 starts at about 14.34% of the coding region. Exons 4~11 cover 63.25% of the coding region. The size of the effective KO region is ~4176 bp. The KO region does not contain any other known gene.
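As a rough cross-check, the percentages above can be converted to base pairs. The sketch below assumes that the coding length is 3 x 595 codons, taken from the 595 aa protein length listed for ENSMUST00000091853 in the transcript table and excluding the stop codon. The variable and function names are illustrative, and the results are approximations rather than values taken from the design itself.

```python
# Approximate conversion of coding-region percentages to base pairs.
# Assumption: CDS length = 3 * 595 codons (stop codon excluded).
PROTEIN_LENGTH_AA = 595
cds_length_bp = 3 * PROTEIN_LENGTH_AA


def percent_to_bp(percent: float, total_bp: int) -> int:
    """Return the nearest whole number of bases for a percentage of total_bp."""
    return round(percent / 100 * total_bp)


bp_before_exon4 = percent_to_bp(14.34, cds_length_bp)
bp_in_exons_4_to_11 = percent_to_bp(63.25, cds_length_bp)

print(f"Assumed CDS length: {cds_length_bp} bp")
print(f"Coding bases upstream of exon 4: ~{bp_before_exon4} bp")
print(f"Coding bases in exons 4~11: ~{bp_in_exons_4_to_11} bp")
```

Note that the ~4176 bp KO region is a genomic span that includes introns, so it is expected to be much larger than the coding bases it removes.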


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked; exons labeled 1, 4, 5, 6, 7, 8, 9, 10, 11, 12. Legend: exon of mouse Net1; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned with itself, showing forward and reverse-complement matches. Axis label: Sequence 12.]

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
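The self-alignment behind a dot plot can be sketched in a few lines. The example below is a minimal illustration, not the tool used for this report: it records every pair of positions whose 15 bp windows are identical on the forward strand only (reverse-complement matches are omitted for brevity). The function name and the toy sequence are hypothetical.

```python
def dot_plot_matches(seq: str, window: int = 15) -> list[tuple[int, int]]:
    """Return sorted (i, j) pairs, i < j, whose windows at i and j are identical."""
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy input (hypothetical): a 20 bp unit repeated in tandem between short flanks.
unit = "ACGTTGCAAGGCTTAACCGT"
toy = "TTTTTGGGGG" + unit + unit + "CCCCCAAAAA"

matches = dot_plot_matches(toy)
offsets = {j - i for i, j in matches}
print(matches)
print("Repeat offsets:", offsets)
```

Off-diagonal points that share one offset form a line parallel to the main diagonal, which is how a tandem repeat appears in the plot.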

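Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned with itself, showing forward and reverse-complement matches. Axis label: Sequence 12.]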

Note: The 329 bp section downstream of Exon 11 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence. Axis label: Sequence 12.]

Summary: Full Length(2000bp) | A(24.05% 481) | C(20.2% 404) | T(35.2% 704) | G(20.55% 411)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
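A windowed GC profile of this kind can be sketched as follows. This is a minimal illustration that assumes a step of 1 bp between windows; the report states only the 300 bp window size, and the function name and toy sequence are hypothetical.

```python
def gc_profile(seq: str, window: int = 300) -> list[float]:
    """Return the GC fraction of every window of the given size (step of 1 bp)."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    gc_count = sum(base in "GC" for base in seq[:window])
    profile = [gc_count / window]
    for start in range(1, len(seq) - window + 1):
        gc_count -= seq[start - 1] in "GC"           # base leaving the window
        gc_count += seq[start + window - 1] in "GC"  # base entering the window
        profile.append(gc_count / window)
    return profile


# Toy input (hypothetical): an AT-only half followed by a GC-only half.
toy = "AT" * 200 + "GC" * 200
profile = gc_profile(toy)
print(f"{len(profile)} windows, min {min(profile):.2f}, max {max(profile):.2f}")
```

A region would be flagged as high-GC when the profile exceeds a chosen cutoff; the report does not state the cutoff it uses.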

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence. Axis label: Sequence 12.]

Summary: Full Length(329bp) | A(30.4% 100) | C(23.4% 77) | T(25.84% 85) | G(20.36% 67)

Note: The 329 bp section downstream of Exon 11 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
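The overall GC content of each flank follows directly from the base counts in the two summaries. The short sketch below uses those counts as printed; the function and variable names are illustrative.

```python
def gc_percent(counts: dict[str, int]) -> float:
    """Return GC content as a percentage of all counted bases."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


# Base counts copied from the two summaries in this report.
upstream = {"A": 481, "C": 404, "T": 704, "G": 411}
downstream = {"A": 100, "C": 77, "T": 85, "G": 67}

for name, counts in (("upstream", upstream), ("downstream", downstream)):
    print(f"{name}: {sum(counts.values())} bp, GC = {gc_percent(counts):.2f}%")
```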


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
--------------------------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr13                 -       3888979    3890978    2000
YourSeq  109    253    372   2000   96.7%     chr5                  -       118481305  118481805  501
YourSeq  103    255    372   2000   94.1%     chr14                 +       63304167   63304285   119
YourSeq  102    251    372   2000   92.2%     chr10                 +       34288866   34288985   120
YourSeq  101    257    372   2000   94.0%     chr3                  -       43126832   43126959   128
YourSeq  99     267    373   2000   96.3%     chr10                 -       120670536  120670642  107
YourSeq  95     263    372   2000   93.7%     chr8                  -       13274889   13275000   112
YourSeq  82     266    372   2000   92.8%     chr5_JH584299_random  +       428270     428377     108
YourSeq  80     281    372   2000   92.1%     chr2                  +       84726331   84726420   90
YourSeq  76     269    367   2000   92.3%     chr17                 +       18152722   18152839   118
YourSeq  68     270    375   2000   85.4%     chr11                 -       103155336  103155439  104
YourSeq  66     279    370   2000   93.5%     chr11                 -       55375829   55375921   93
YourSeq  61     277    365   2000   84.3%     chr18                 -       63678910   63678998   89
YourSeq  54     250    320   2000   84.1%     chr2                  +       165008665  165008733  69
YourSeq  51     274    338   2000   89.3%     chr19                 -       3772372    3772436    65
YourSeq  50     274    334   2000   84.3%     chr7                  +       82372262   82372318   57
YourSeq  47     1206   1300  2000   94.3%     chr13                 +       111613365  111613471  107
YourSeq  46     250    314   2000   86.0%     chr9                  -       58243953   58244018   66
YourSeq  46     1226   1320  2000   92.6%     chr6                  +       83076522   83076856   335
YourSeq  44     1227   1319  2000   85.3%     chr1                  -       4763682    4763785    104

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
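One way to read these results is to separate the expected self-match from all other hits and check how much of the query the best remaining hit covers. The sketch below parses three rows copied from the upstream results; it takes the last ten whitespace-separated fields of each row, so any leading link text is ignored. The class and function names are hypothetical, and the report does not state the cutoff it uses for "significant" similarity.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one result row using its last ten whitespace-separated fields."""
    f = line.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


# Three rows copied from the upstream BLAT results.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr13 - 3888979 3890978 2000",
    "YourSeq 109 253 372 2000 96.7% chr5 - 118481305 118481805 501",
    "YourSeq 47 1206 1300 2000 94.3% chr13 + 111613365 111613471 107",
]
hits = [parse_hit(row) for row in rows]

self_hit = max(hits, key=lambda h: h.score)  # full-length match to the query locus
other_hits = [h for h in hits if h is not self_hit]
best_other = max(other_hits, key=lambda h: h.score)
coverage = (best_other.q_end - best_other.q_start + 1) / best_other.q_size

print(f"Self hit: {self_hit.chrom}:{self_hit.t_start}-{self_hit.t_end}")
print(f"Best other hit: {best_other.chrom}, score {best_other.score}, "
      f"covers {coverage:.1%} of the query")
```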

BLAT Search Results (down)

QUERY    SCORE  START  END  QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  329    1      329  329    100.0%    chr13  -       3884474    3884802    329
YourSeq  45     74     154  329    92.4%     chr10  -       4050861    4267602    216742
YourSeq  40     55     110  329    83.4%     chr9   +       57545695   57545749   55
YourSeq  36     72     110  329    97.5%     chr11  +       77010165   77010205   41
YourSeq  35     62     103  329    81.6%     chr12  -       76116567   76116604   38
YourSeq  33     69     110  329    97.2%     chr11  -       85301579   85301630   52
YourSeq  33     72     125  329    74.3%     chr6   +       143102421  143102462  42
YourSeq  31     61     95   329    94.3%     chr2   -       121258046  121258080  35
YourSeq  30     73     112  329    91.5%     chr9   +       121310835  121310877  43
YourSeq  30     64     95   329    96.9%     chr6   +       83658663   83658694   32
YourSeq  30     31     95   329    96.9%     chr10  +       94139314   94139412   99
YourSeq  29     76     112  329    90.4%     chr11  -       19868731   19868766   36
YourSeq  29     69     101  329    94.0%     chr5   +       123637741  123637773  33
YourSeq  28     77     107  329    96.7%     chr9   -       54597909   54597942   34
YourSeq  28     79     107  329    100.0%    chr1   -       180681271  180681314  44
YourSeq  27     69     95   329    100.0%    chr12  -       18509495   18509521   27
YourSeq  27     69     95   329    100.0%    chr11  -       51686155   51686181   27
YourSeq  27     74     110  329    96.6%     chr10  -       61166919   61166955   37
YourSeq  27     69     103  329    88.6%     chr1   +       175531927  175531961  35
YourSeq  25     67     93   329    96.3%     chr9   -       22000866   22000892   27

Note: The 329 bp section downstream of Exon 11 is BLAT searched against the genome. No significant similarity is found.
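The chr13 self-matches of the two flanks can be used to cross-check the stated ~4176 bp KO region. The sketch below assumes that the coordinates are 1-based and inclusive as printed, and that the two flanks directly abut the KO region; because Net1 lies on the reverse strand, the flank upstream of exon 4 has the higher coordinates. The names are illustrative.

```python
# chr13 coordinates of the 100%-identity self-matches in the BLAT results.
upstream_flank = (3888979, 3890978)    # 2000 bp upstream of exon 4
downstream_flank = (3884474, 3884802)  # 329 bp downstream of exon 11


def interval_length(interval: tuple[int, int]) -> int:
    """Length of a 1-based inclusive interval."""
    return interval[1] - interval[0] + 1


def gap_between(lower: tuple[int, int], upper: tuple[int, int]) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


ko_region_bp = gap_between(downstream_flank, upstream_flank)

print(f"Upstream flank: {interval_length(upstream_flank)} bp")
print(f"Downstream flank: {interval_length(downstream_flank)} bp")
print(f"Span between flanks: {ko_region_bp} bp")
```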


Gene information: Net1, neuroepithelial cell transforming gene 1 [Mus musculus (house mouse)]. Gene ID: 56349, updated on 24-Sep-2019

Gene summary

Official Symbol: Net1 (provided by MGI)
Official Full Name: neuroepithelial cell transforming gene 1 (provided by MGI)
Primary source: MGI:MGI:1927138
See related: Ensembl:ENSMUSG00000021215
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Net1a; mNET1
Expression: Ubiquitous expression in large intestine adult (RPKM 25.9), limb E14.5 (RPKM 18.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 13; 13 A1
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (3882018..3918220, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (3881807..3917466, complement)

[Figure: genomic context of Net1 on Chromosome 13 - NC_000079.6]


Transcript information: This gene has 7 transcripts

Gene: Net1 ENSMUSG00000021215

Description: neuroepithelial cell transforming gene 1 [Source:MGI Symbol;Acc:MGI:1927138]
Gene Synonyms: 0610025H04Rik, 9530071N24Rik, Net1 homolog, mNET1
Location: Chromosome 13: 3,882,018-3,918,220 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 7 transcripts (splice variants), 256 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt        Flags
Net1-202  ENSMUST00000099946.5   3871  541aa       ENSMUSP00000097529.4  Protein coding   CCDS36583  Q3USZ7 Q9Z206  TSL:1 GENCODE basic APPRIS ALT2
Net1-201  ENSMUST00000091853.11  2680  595aa       ENSMUSP00000089464.4  Protein coding   CCDS26216  Q9Z206         TSL:1 GENCODE basic APPRIS P3
Net1-207  ENSMUST00000223258.1   597   72aa        ENSMUSP00000152333.1  Protein coding   -          A0A1Y7VJ80     TSL:1 GENCODE basic
Net1-206  ENSMUST00000222504.1   447   44aa        ENSMUSP00000152173.1  Protein coding   -          A0A1Y7VKV6     TSL:3 GENCODE basic
Net1-203  ENSMUST00000220887.1   763   No protein  -                     Retained intron  -          -              TSL:5
Net1-204  ENSMUST00000222017.1   569   No protein  -                     Retained intron  -          -              TSL:NA
Net1-205  ENSMUST00000222442.1   519   No protein  -                     Retained intron  -          -              TSL:3


[Figure: Ensembl genomic view of a 56.20 kb region, 3.88 Mb to 3.92 Mb. Forward strand: Tubal3-202 (lncRNA), Tubal3-201 (protein coding). Contig: AC139323.4. Reverse strand: Net1-201, Net1-202, Net1-206 and Net1-207 (protein coding); Net1-203, Net1-204 and Net1-205 (retained intron); Gm47813-201 (TEC). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000091853

[Figure: protein domain view of Net1-201 (protein coding, reverse strand, 34.89 kb), protein ENSMUSP00000089... Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily (Dbl homology (DH) domain superfamily; SSF50729); SMART, Pfam and PROSITE profiles (Dbl homology (DH) domain; Pleckstrin homology domain); PROSITE patterns (Guanine-nucleotide dissociation stimulator, CDC24, conserved site); PANTHER (PTHR46006:SF4; PTHR46006); Gene3D (Dbl homology (DH) domain superfamily; PH-like domain superfamily); CDD (Dbl homology (DH) domain; Net1, PH domain); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 595.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
