https://www.alphaknockout.com

Mouse Nrd1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Nrd1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Nrd1 gene (NCBI Reference Sequence: NM_146150; Ensembl: ENSMUSG00000053510) is located on mouse chromosome 4. Thirty-one exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 31 (Transcript: ENSMUST00000065977). Exon 4 is selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Nrd1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-315E18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele mostly die within 48 hours of birth; surviving mice exhibit cortical thinning, enlarged lateral ventricles, hypomyelination, reduced grip strength, impaired coordination, and impaired spatial working memory.

Exon 4 starts at about 21.42% of the coding region. Knockout of exon 4 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 1037 bp, and the size of intron 4 for 3'-loxP site insertion is 5092 bp. The size of the effective cKO region is ~654 bp. The cKO region does not contain any other known gene.
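The frameshift reasoning above can be illustrated with a small sketch. It relies on the general rule that removing a number of coding nucleotides that is not a multiple of 3 shifts the reading frame. The report does not state the coding length of exon 4, so the exon lengths used below are hypothetical placeholders. The function names are also illustrative, and the CDS length is assumed to be 3 nt per residue plus one stop codon, using the 1161 aa length listed for Nrd1-201.

```python
# Sketch only: the exon lengths used below are hypothetical, not values from the report.

def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


def approx_cds_offset(fraction: float, protein_aa: int) -> int:
    """Approximate nucleotide offset into the CDS for a fractional position.

    Assumption: CDS length = 3 nt per residue plus one 3-nt stop codon.
    """
    cds_nt = protein_aa * 3 + 3
    return round(fraction * cds_nt)


if __name__ == "__main__":
    # 1161 aa is the protein length listed for Nrd1-201 (ENSMUST00000065977).
    print("approximate CDS offset of exon 4 start:", approx_cds_offset(0.2142, 1161))
    # Hypothetical exon lengths, for illustration only.
    print("100 nt deleted, frameshift:", causes_frameshift(100))
    print(" 99 nt deleted, frameshift:", causes_frameshift(99))
```

A deletion early in the coding region that also shifts the frame is the usual rationale for expecting loss of function, which is consistent with the choice of exon 4 here.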


Overview of the Targeting Strategy

[Figure: schematic of four constructs: the wildtype allele (exon labels 1, 3, 4 and 31, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Nrd1; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

[Figure: dot plot of the sequence against itself (window size: 10 bp), showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
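As a rough illustration of the self-alignment described in the note, the sketch below indexes every 10 bp window of a sequence and reports off-diagonal positions where the same window recurs, either on the forward strand or as a reverse complement. This is not the tool used to produce the report's dot plot. The function names and the toy sequences are assumptions for illustration, and the sequence is assumed to contain only A, C, G and T.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window matches of seq against itself.

    forward: seq[i:i+window] == seq[j:j+window] with i < j (main diagonal excluded)
    reverse: seq[i:i+window] equals the reverse complement of seq[j:j+window]
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for kmer, positions in index.items():
        # Repeated windows on the same strand.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                forward.append((positions[a], positions[b]))
        # Windows whose reverse complement also occurs in the sequence.
        for j in index.get(reverse_complement(kmer), []):
            for i in positions:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


if __name__ == "__main__":
    # Toy sequence with one 10 bp unit repeated in tandem (hypothetical data).
    toy = "TTTT" + "ACGGTCATTG" * 2 + "CCCC"
    fwd, rev = self_dot_plot(toy, window=10)
    print("forward matches:", fwd)
    print("reverse-complement matches:", rev)
```

In a plot of these points, a tandem repeat shows up as a run of forward matches parallel to the main diagonal and offset by the repeat length.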

Overview of the GC Content Distribution

[Figure: GC content distribution along the sequence (window size: 300 bp).]

Summary: Full length (7154 bp) | A (30.58%, 2188) | C (17.19%, 1230) | T (32.44%, 2321) | G (19.78%, 1415)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
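The base counts in the summary can be cross-checked, and the windowed analysis sketched, as follows. The counts are taken from the summary line. The function names are illustrative, and the window is assumed to slide one base at a time, which the report does not specify.

```python
# Base counts as reported in the summary line of this report.
REPORTED_COUNTS = {"A": 2188, "C": 1230, "T": 2321, "G": 1415}


def gc_percent(counts: dict) -> float:
    """Overall GC content in percent from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300) -> list:
    """GC percent for every window of the given size, sliding by one base (assumed step)."""
    seq = seq.upper()
    values = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


if __name__ == "__main__":
    print("total length:", sum(REPORTED_COUNTS.values()))       # 7154
    print("overall GC %:", round(gc_percent(REPORTED_COUNTS), 2))  # 36.97
    # Toy sequence with a small window, for illustration only.
    print(windowed_gc("GGCCAATT", window=4))
```

The four counts sum to the stated full length of 7154 bp, and the overall GC content works out to about 36.97%, in line with the note that no high GC-content region is present.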


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
----------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr4   +       109015709  109018708  3000
YourSeq     74   2421   2620   3000     92.1%  chr1   +       119900235  119900456   222
YourSeq     68   2436   2619   3000     81.7%  chr16  +         5314936    5315486   551
YourSeq     67   2446   2631   3000     81.0%  chr11  +        55177855   55178077   223
YourSeq     60   2522   2629   3000     94.2%  chr3   +        28576913   28577090   178
YourSeq     59   2437   2572   3000     81.2%  chr1   +       167413339  167413464   126
YourSeq     56   2440   2610   3000     81.9%  chr9   +        14684234   14684425   192
YourSeq     55   2442   2535   3000     84.9%  chr11  +       116587463  116587557    95
YourSeq     53   2460   2628   3000     92.1%  chr14  -        86294089   86294259   171
YourSeq     52   2464   2616   3000     94.9%  chr5   -         4022026    4022205   180
YourSeq     52   2440   2616   3000     92.0%  chr11  +        21196001   21196206   206
YourSeq     51   2436   2616   3000     91.9%  chr10  -        67310193   67310404   212
YourSeq     51   2458   2536   3000     88.0%  chr6   +       114056195  114056271    77
YourSeq     49   2444   2539   3000     80.0%  chr9   -       113642796  113642890    95
YourSeq     47   2458   2537   3000     75.7%  chr11  -       114046328  114046405    78
YourSeq     47   2426   2516   3000     79.8%  chr11  -        80195659   80195756    98
YourSeq     47   2458   2518   3000     88.6%  chr2   +       174447921  174447981    61
YourSeq     47   2548   2628   3000     92.9%  chr12  +        84024469   84024794   326
YourSeq     47   2553   2630   3000     94.5%  chr12  +        78186446   78186523    78
YourSeq     46   2439   2516   3000     92.6%  chr4   +       134379604  134379683    80

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
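One way to read a BLAT table like the one above is to set aside the full-length self-hit and look at the strongest remaining hit. The sketch below does this for the first three rows of the upstream table. The parser and function names are hypothetical, and treating a hit with score equal to the query size and 100% identity as the self-hit is an assumption of this sketch. What counts as "significant" similarity is left to the reader.

```python
# First three rows of the upstream BLAT table in this report.
SAMPLE_ROWS = [
    "YourSeq 3000 1 3000 3000 100.0% chr4 + 109015709 109018708 3000",
    "YourSeq 74 2421 2620 3000 92.1% chr1 + 119900235 119900456 222",
    "YourSeq 68 2436 2619 3000 81.7% chr16 + 5314936 5315486 551",
]


def parse_blat_row(line: str) -> dict:
    """Split one whitespace-separated BLAT result row into named fields."""
    f = line.split()
    return {
        "score": int(f[1]),
        "q_start": int(f[2]),
        "q_end": int(f[3]),
        "q_size": int(f[4]),
        "identity": float(f[5].rstrip("%")),
        "chrom": f[6],
        "strand": f[7],
        "t_start": int(f[8]),
        "t_end": int(f[9]),
        "span": int(f[10]),
    }


def off_target_hits(rows: list) -> list:
    """Hits other than the full-length, 100%-identity match (assumed to be the self-hit)."""
    hits = [parse_blat_row(r) for r in rows]
    return [h for h in hits
            if not (h["score"] == h["q_size"] and h["identity"] == 100.0)]


def best_off_target(rows: list):
    """Highest-scoring off-target hit, or None if there is none."""
    hits = off_target_hits(rows)
    return max(hits, key=lambda h: h["score"]) if hits else None


if __name__ == "__main__":
    best = best_off_target(SAMPLE_ROWS)
    print(best["chrom"], best["score"], round(best["score"] / best["q_size"], 4))
```

For these rows the strongest off-target hit scores 74 against a 3000 bp query, a small fraction of the self-hit score.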

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
----------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr4   +       109019363  109022362  3000
YourSeq     52   1350   1427   3000     95.0%  chr1   +       189724892  189725115   224
YourSeq     46   1354   1407   3000     96.3%  chr1   +        73454983   73455065    83
YourSeq     45     67    212   3000     88.0%  chr1   -        83757490   83757667   178
YourSeq     45   1352   1400   3000     98.0%  chr10  +       100937814  100938014   201
YourSeq     45   1350   1401   3000     94.2%  chr10  +        24941079   24941134    56
YourSeq     45   1352   1400   3000     96.0%  chr10  +         5058663    5058711    49
YourSeq     44   1352   1400   3000     96.0%  chr10  +        78844797   78844849    53
YourSeq     44   1352   1400   3000     96.0%  chr1   +        14949748   14949798    51
YourSeq     43   1352   1400   3000     96.0%  chr1   +       128552413  128552481    69
YourSeq     42   1350   1395   3000     97.9%  chr1   -       165035104  165035163    60
YourSeq     42   1350   1395   3000     97.9%  chr10  +        80362565   80362819   255
YourSeq     42   1352   1400   3000     87.3%  chr10  +         5058645    5058691    47
YourSeq     42   1350   1392   3000    100.0%  chr1   +       160473084  160473172    89
YourSeq     41   1350   1394   3000     97.8%  chr1   -       110573755  110573811    57
YourSeq     41   1350   1395   3000     95.7%  chr1   -         4258690    4258852   163
YourSeq     41   1350   1394   3000     95.6%  chr1   -         4258657    4258701    45
YourSeq     40   1350   1394   3000     95.6%  chr1   -        76730075   76730165    91
YourSeq     40   1352   1394   3000     97.7%  chr1   -        76730141   76730187    47
YourSeq     40   1352   1394   3000     97.7%  chr1   -        25105094   25105210   117

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
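The chr4 coordinates of the two self-hits (upstream and downstream) can be used to cross-check the stated size of the cKO region. The sketch below assumes both coordinate pairs are 1-based and inclusive, as the 3000 bp spans suggest. The variable and function names are illustrative.

```python
# chr4 coordinates of the full-length self-hits in the two BLAT tables of this report.
UPSTREAM = (109015709, 109018708)     # 3000 bp section upstream of exon 4
DOWNSTREAM = (109019363, 109022362)   # 3000 bp section downstream of exon 4
GENE = (109000655, 109061777)         # Nrd1 location on GRCm38 chromosome 4


def span(start: int, end: int) -> int:
    """Length of an inclusive coordinate interval."""
    return end - start + 1


def gap_between(left_end: int, right_start: int) -> int:
    """Number of bases strictly between two inclusive intervals."""
    return right_start - left_end - 1


if __name__ == "__main__":
    print("upstream span:  ", span(*UPSTREAM))                             # 3000
    print("downstream span:", span(*DOWNSTREAM))                           # 3000
    print("region between: ", gap_between(UPSTREAM[1], DOWNSTREAM[0]))     # 654
    inside = GENE[0] <= UPSTREAM[0] and DOWNSTREAM[1] <= GENE[1]
    print("flanks lie within the Nrd1 locus:", inside)                     # True
```

The 654 bp between the two flanking sections matches the stated effective cKO region size of ~654 bp, and both sections fall inside the annotated Nrd1 locus.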


Gene information:

Nrd1 nardilysin, N-arginine dibasic convertase, NRD convertase 1 [Mus musculus (house mouse)]
Gene ID: 230598, updated on 10-Oct-2019

Gene summary

Official Symbol: Nrd1 (provided by MGI)
Official Full Name: nardilysin, N-arginine dibasic convertase, NRD convertase 1 (provided by MGI)
Primary source: MGI:MGI:1201386
See related: Ensembl:ENSMUSG00000053510
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Nrdc; NRD-C; AI875733; 2600011I06Rik
Expression: Broad expression in testis adult (RPKM 91.8), CNS E11.5 (RPKM 25.7) and 22 other tissues

Genomic context

Location: 4; 4 C7

Exon count: 33

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (109000655..109061777)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (108673410..108734376)

Chromosome 4 - NC_000070.6


Transcript information: This gene has 12 transcripts

Gene: Nrd1 ENSMUSG00000053510

Description: nardilysin, N-arginine dibasic convertase, NRD convertase 1 [Source:MGI Symbol;Acc:MGI:1201386]
Gene Synonyms: NRD-C
Location: Chromosome 4: 109,000,655-109,061,777 forward strand. GRCm38:CM000997.2
About this gene: This gene has 12 transcripts (splice variants), 287 orthologues, 6 paralogues, is a member of 1 Ensembl protein family and is associated with 15 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Nrd1-201  ENSMUST00000065977.10  4311  1161aa      ENSMUSP00000068328.4  Protein coding   CCDS18460  Q8BHG1   TSL:1 GENCODE basic APPRIS P3
Nrd1-203  ENSMUST00000106644.8   3784  1229aa      ENSMUSP00000102255.2  Protein coding   CCDS84773  A2A9Q2   TSL:5 GENCODE basic APPRIS ALT2
Nrd1-202  ENSMUST00000102736.8   4179  1117aa      ENSMUSP00000099797.2  Protein coding   -          A6PWC3   TSL:1 GENCODE basic APPRIS ALT2
Nrd1-205  ENSMUST00000125645.1   2190  395aa       ENSMUSP00000122808.1  Protein coding   -          Q3V3G9   CDS 3' incomplete TSL:1
Nrd1-207  ENSMUST00000143604.1   966   No protein  -                     Retained intron  -          -        TSL:2
Nrd1-204  ENSMUST00000125063.1   781   No protein  -                     Retained intron  -          -        TSL:5
Nrd1-209  ENSMUST00000150177.7   604   No protein  -                     Retained intron  -          -        TSL:2
Nrd1-212  ENSMUST00000155228.1   560   No protein  -                     Retained intron  -          -        TSL:2
Nrd1-206  ENSMUST00000143015.1   534   No protein  -                     Retained intron  -          -        TSL:2
Nrd1-211  ENSMUST00000150796.1   164   No protein  -                     Retained intron  -          -        TSL:3
Nrd1-210  ENSMUST00000150784.7   1958  No protein  -                     lncRNA           -          -        TSL:1
Nrd1-208  ENSMUST00000148444.1   363   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl genome browser view of the Nrd1 locus on chromosome 4, 109.00 Mb to 109.06 Mb (81.12 kb). Forward strand: Nrd1-201, Nrd1-202, Nrd1-203 and Nrd1-205 (protein coding); Nrd1-208 and Nrd1-210 (lncRNA); Nrd1-204, Nrd1-206, Nrd1-207, Nrd1-209, Nrd1-211 and Nrd1-212 (retained intron); Mir761-201 (miRNA). Contig: AL627406.15. Reverse strand: Gm23589-201 (snRNA); Osbpl9-211 (nonsense mediated decay); Osbpl9-201, -202, -213, -214, -216, -218 and -219 (protein coding); Osbpl9-203, -205, -209 and -210 (lncRNA). The Regulatory Build track is shown below the genes. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000065977

[Figure: protein domain view of Nrd1-201 (protein coding; 61.12 kb, forward strand), translation ENSMUSP00000068..., with a scale bar from 0 to 1161. Annotation tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: Metalloenzyme, LuxS/M16 peptidase-like; Pfam: Peptidase M16, N-terminal; Peptidase M16, middle/third domain; Peptidase M16, C-terminal; PROSITE patterns: Peptidase M16, zinc-binding site; PANTHER: PTHR43690:SF14, PTHR43690; Gene3D: 3.30.830.10. Sequence variants (dbSNP and all other sources) are shown with the legend: stop gained; inframe insertion; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
