https://www.alphaknockout.com

Mouse Ndrg2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Ndrg2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Ndrg2 gene (NCBI Reference Sequence: NM_013864; Ensembl: ENSMUSG00000004558) is located on mouse chromosome 14. 16 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 16 (Transcript: ENSMUST00000004673). Exons 2~6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Ndrg2 gene.

To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-30G2 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null allele develop various types of tumors, including T-cell lymphomas, and have a shorter lifespan. Homozygotes for a second null allele show vertebral transformations. Homozygotes for a third null allele show reduced astrogliosis and inflammatory response after brain injury.

Exon 2 starts from about 100% of the coding region. The knockout of exons 2~6 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 1776 bp, and the size of intron 6 for 3'-loxP site insertion is 1123 bp. The size of the effective cKO region is ~1983 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the targeting strategy. Panels: wildtype allele (5' and 3' gRNA regions marked; exons 1-10 and 16 labeled), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: homology arm, exon of mouse Ndrg2, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
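The self-alignment described in the note can be illustrated with a minimal sketch. It assumes exact matching of 10 bp windows, which is a simplification; the tool used to produce the report's dot plot is not stated and may score matches differently. The names `dot_plot` and `reverse_complement` and the toy sequence are hypothetical and are not taken from the project data.

```python
from collections import defaultdict


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot(seq_x, seq_y, window=10):
    """Return (i, j) pairs where seq_x[i:i+window] == seq_y[j:j+window].

    Exact window matching is an assumption made for illustration.
    """
    # Index every window of seq_y by its sequence.
    index = defaultdict(list)
    for j in range(len(seq_y) - window + 1):
        index[seq_y[j:j + window]].append(j)
    # Look up every window of seq_x in that index.
    dots = []
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            dots.append((i, j))
    return dots


# Toy sequence (hypothetical): a 10 bp unit repeated three times in tandem,
# followed by a stretch that equals its own reverse complement.
toy = "ACGTTGCAAG" * 3 + "TTTTCCCCGGGGAAAA"

# Forward self-comparison; the main diagonal (i == j) is trivial and dropped.
forward = [(i, j) for i, j in dot_plot(toy, toy) if i != j]

# Comparison against the reverse complement reveals inverted repeats.
reverse = dot_plot(toy, reverse_complement(toy))
```

In the forward comparison, tandem repeats appear as off-diagonal dots at a constant offset from the main diagonal; here the offsets are multiples of the 10 bp repeat unit.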

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(8483bp) | A(20.58% 1746) | C(25.3% 2146) | T(26.46% 2245) | G(27.66% 2346)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
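The base counts in the summary line can be checked with a few lines of arithmetic, and the 300 bp windowed profile can be sketched the same way. The function name `gc_fraction_windows`, the one-base step size, and the toy sequence are assumptions for illustration; the report does not state how its windows were stepped.

```python
# Base counts copied from the summary line of the report.
counts = {"A": 1746, "C": 2146, "T": 2245, "G": 2346}

total = sum(counts.values())  # should equal the stated full length, 8483 bp
gc_percent = 100 * (counts["G"] + counts["C"]) / total


def gc_fraction_windows(seq, window=300):
    """GC fraction of every window of the given size, sliding one base at a time.

    The one-base step is an assumption; the report only states the window size.
    """
    if len(seq) < window:
        return []
    is_gc = [base in "GC" for base in seq.upper()]
    count = sum(is_gc[:window])
    fractions = [count / window]
    for start in range(1, len(seq) - window + 1):
        # Update the running count: add the base entering, drop the base leaving.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        fractions.append(count / window)
    return fractions


# Toy input (hypothetical): a GC-only half followed by an AT-only half.
profile = gc_fraction_windows("GC" * 150 + "AT" * 150, window=300)
```

With the counts above, the overall GC content works out to roughly 53%, so the difficulty flagged in the note would come from local high-GC windows rather than from the average.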


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       51911791   51914790   3000
YourSeq  55     2536   2590  3000   100.0%    chr1   +       177466292  177466346  55
YourSeq  37     2400   2438  3000   100.0%    chr16  -       11163726   11163766   41
YourSeq  31     2290   2767  3000   42.5%     chr1   -       129680120  129680158  39

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr14  -       51906808   51909807   3000
YourSeq  71     253    428   3000   70.3%     chr12  +       111022519  111022698  180
YourSeq  64     248    385   3000   73.2%     chr12  -       68310993   68311130   138
YourSeq  58     295    420   3000   71.2%     chrX   +       164081770  164081885  116
YourSeq  58     293    420   3000   74.6%     chr12  +       44519046   44519173   128
YourSeq  58     295    420   3000   71.6%     chr1   +       168977281  168977395  115
YourSeq  56     291    417   3000   73.6%     chr12  -       79764546   79764672   127
YourSeq  55     296    404   3000   75.3%     chr19  +       7515018    7515126    109
YourSeq  54     293    420   3000   71.1%     chr16  +       21539073   21539200   128
YourSeq  54     295    420   3000   79.7%     chr11  +       103963427  103963540  114
YourSeq  52     293    410   3000   72.1%     chr15  -       103130845  103130962  118
YourSeq  48     371    868   3000   62.5%     chr11  +       58157951   58158069   119
YourSeq  47     252    420   3000   96.1%     chr11  -       86891683   86891852   170
YourSeq  47     293    409   3000   77.7%     chr12  +       87847540   87847657   118
YourSeq  43     222    384   3000   93.9%     chr16  -       16115456   16115618   163
YourSeq  43     296    421   3000   86.3%     chr3   +       66160938   66161061   124
YourSeq  42     293    404   3000   68.8%     chr15  +       38719946   38720057   112
YourSeq  40     371    420   3000   97.7%     chr6   +       90361863   90361912   50
YourSeq  39     331    420   3000   93.5%     chr12  +       76335481   76335570   90
YourSeq  38     214    414   3000   95.3%     chr5   -       17842555   17842757   203

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
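As a consistency check, the chr14 coordinates of the two full-length BLAT hits can be compared with the stated cKO region size. The sketch below assumes that the coordinates are 1-based and inclusive (which agrees with the reported SPAN of 3000) and that the two 3000 bp queries directly flank the cKO region; the variable and function names are hypothetical.

```python
# chr14 coordinates of the 100% identity hits, copied from the BLAT results.
# Ndrg2 lies on the reverse strand, so the section upstream of exon 2
# has the higher genomic coordinates.
upstream_hit = (51911791, 51914790)    # 3000 bp upstream of exon 2
downstream_hit = (51906808, 51909807)  # 3000 bp downstream of exon 6


def interval_length(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(lower, upper):
    """Number of bases strictly between two non-overlapping intervals."""
    return upper[0] - lower[1] - 1


upstream_len = interval_length(upstream_hit)
downstream_len = interval_length(downstream_hit)

# Bases lying between the downstream and upstream sections on chr14.
flanked_region = gap_between(downstream_hit, upstream_hit)
```

Under these assumptions the flanked interval is 1983 bp, which agrees with the effective cKO region size of ~1983 bp given in the strategy summary.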


Gene and information:

Ndrg2 N-myc downstream regulated gene 2 [ Mus musculus (house mouse) ]
Gene ID: 29811, updated on 24-Oct-2019

Gene summary

Official Symbol: Ndrg2 (provided by MGI)
Official Full Name: N-myc downstream regulated gene 2 (provided by MGI)
Primary source: MGI:MGI:1352498
See related: Ensembl:ENSMUSG00000004558
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Ndr2; SYLD; AI182517; AU040374
Expression: Broad expression in liver adult (RPKM 416.8), heart adult (RPKM 317.4) and 15 other tissues
Orthologs: human; all

Genomic context

Location: 14; 14 C1

Exon count: 21

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  14   NC_000080.6 (51905271..51914004, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    14   NC_000080.5 (52524946..52533163, complement)

[Figure: genomic context of Ndrg2 on chromosome 14 (NC_000080.6).]


Transcript information: This gene has 13 transcripts

Gene: Ndrg2 ENSMUSG00000004558

Description: N-myc downstream regulated gene 2 [Source:MGI Symbol;Acc:MGI:1352498]
Gene Synonyms: Ndr2
Location: Chromosome 14: 51,905,271-51,914,158 reverse strand. GRCm38:CM001007.2
About this gene: This gene has 13 transcripts (splice variants), 179 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 19 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Ndrg2-202  ENSMUST00000111632.4   2103  357aa       ENSMUSP00000107259.3  Protein coding   CCDS49486  Q9QYG0      TSL:1 GENCODE basic APPRIS P1
Ndrg2-201  ENSMUST00000004673.14  2099  371aa       ENSMUSP00000004673.7  Protein coding   CCDS27047  Q9QYG0      TSL:1 GENCODE basic
Ndrg2-209  ENSMUST00000227237.1   1060  273aa       ENSMUSP00000153938.1  Protein coding   -          A0A2I3BPW0  CDS 3' incomplete
Ndrg2-210  ENSMUST00000227402.1   896   238aa       ENSMUSP00000154279.1  Protein coding   -          A0A2I3BQR5  CDS 3' incomplete
Ndrg2-211  ENSMUST00000228164.1   635   176aa       ENSMUSP00000153830.1  Protein coding   -          A0A2I3BPL1  CDS 3' incomplete
Ndrg2-204  ENSMUST00000226184.1   602   114aa       ENSMUSP00000154135.1  Protein coding   -          A0A2I3BQM5  CDS 3' incomplete
Ndrg2-207  ENSMUST00000226528.1   384   114aa       ENSMUSP00000154716.1  Protein coding   -          A0A2I3BS20  CDS 3' incomplete
Ndrg2-212  ENSMUST00000228173.1   481   No protein  -                     Retained intron  -          -           -
Ndrg2-213  ENSMUST00000228620.1   439   No protein  -                     Retained intron  -          -           -
Ndrg2-205  ENSMUST00000226364.1   435   No protein  -                     Retained intron  -          -           -
Ndrg2-206  ENSMUST00000226366.1   427   No protein  -                     Retained intron  -          -           -
Ndrg2-203  ENSMUST00000226122.1   384   No protein  -                     Retained intron  -          -           -
Ndrg2-208  ENSMUST00000226698.1   508   No protein  -                     lncRNA           -          -           -


[Figure: Ensembl genomic view of the Ndrg2 locus, 28.89 kb, chromosome 14, 51.90-51.92 Mb. Forward strand: Slc39a2-201 (protein coding), Slc39a2-202 (retained intron), Tppp2-201 and Tppp2-202 (protein coding). Contig: AC125087.2. Reverse strand: the 13 Ndrg2 transcripts (Ndrg2-201 to Ndrg2-213) and Rnase13-201 (protein coding). Regulatory Build legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000004673

[Figure: Ensembl protein view of Ndrg2-201 (protein coding), reverse strand, 8.16 kb, for ENSMUSP00000004... Tracks: PDB-ENSP mappings, MobiDB lite, low complexity (Seg), Superfamily (Alpha/Beta hydrolase fold), Pfam (NDRG), PANTHER (NDRG; Protein NDRG2), Gene3D (Alpha/Beta hydrolase fold), sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 371.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
