https://www.alphaknockout.com

Mouse Nrxn3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Nrxn3 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Nrxn3 gene (NCBI Reference Sequence: NM_001198587; Ensembl: ENSMUSG00000066392) is located on mouse chromosome 12. Twenty exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 20 (Transcript: ENSMUST00000163134). Exon 6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Nrxn3 gene. To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP24-159E17 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Twenty percent of mice homozygous for a knock-out allele die postnatally prior to 20 days of age.

Exon 6 starts at about 25.93% of the way into the coding region. Knockout of exon 6 will result in a frameshift of the gene. Intron 5, which receives the 5'-loxP site, is 61466 bp; intron 6, which receives the 3'-loxP site, is 5025 bp. The effective cKO region is ~939 bp and does not contain any other known gene.
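The frameshift reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not the vendor's design tool: the report does not state the length of exon 6, so the exon lengths below are hypothetical, and the residue position is only a rough estimate derived from the 25.93% figure and the 1571 aa length of transcript ENSMUST00000163134.

```python
def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Return True if deleting this many coding bases shifts the reading frame."""
    if deleted_coding_bp < 0:
        raise ValueError("deleted length must be non-negative")
    # A deletion keeps the frame only when it removes whole codons (multiple of 3).
    return deleted_coding_bp % 3 != 0


# Hypothetical exon lengths, for illustration only (not taken from the report).
for length in (120, 121, 122):
    print(length, "bp deleted -> frameshift:", causes_frameshift(length))

# Rough position of the start of exon 6 in the 1571 aa protein,
# assuming the 25.93% figure applies uniformly along the coding sequence.
PROTEIN_LENGTH_AA = 1571
EXON6_START_FRACTION = 0.2593
approx_residue = round(EXON6_START_FRACTION * PROTEIN_LENGTH_AA)
print("exon 6 begins near residue", approx_residue)
```

A deletion that removes a number of coding bases not divisible by 3 changes every downstream codon, which is why an early frameshifting exon is a common choice for a cKO region.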


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 6 and 20 marked, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Nrxn3; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
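The self-alignment described in the note can be sketched as follows. This is a minimal Python illustration of an exact-match dot plot with a 10 bp window, not the tool actually used for the report; the function names and the toy sequence are assumptions made for the example.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT characters are left as-is)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) positions whose windows match.

    forward: window at i equals window at j, with j > i (the trivial diagonal
             i == j is skipped).
    reverse: window at i equals the reverse complement of the window at j,
             with j >= i (j == i means the window is its own reverse complement).
    """
    seq = seq.upper()
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(len(seq) - window + 1):
        kmer = seq[i:i + window]
        forward.extend((i, j) for j in index[kmer] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(kmer), []) if j >= i)
    return forward, reverse


# Toy sequence with one 10 bp direct repeat separated by TTTT.
toy = "ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG"
fwd, rev = dot_plot_matches(toy, window=10)
print("forward matches:", fwd)
```

Long runs of consecutive off-diagonal matches would indicate tandem or inverted repeats; a sequence suitable for PCR screening shows few or none.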

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7439bp) | A(29.55% 2198) | C(20.23% 1505) | T(30.49% 2268) | G(19.73% 1468)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
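As a cross-check of the summary above, the overall GC content can be recomputed from the reported base counts, and a sliding-window profile can be sketched for any sequence. The counts come from the summary line; the `windowed_gc` helper and its toy input are illustrative assumptions, since the sequence itself is not included in this report.

```python
# Base counts as reported in the summary line.
counts = {"A": 2198, "C": 1505, "T": 2268, "G": 1468}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, GC = {gc_percent:.2f}%")


def windowed_gc(seq: str, window: int = 300):
    """GC percentage for every full window of the given size, sliding by 1 bp."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    running = sum(is_gc[:window])
    profile = [100 * running / window]
    for i in range(window, len(seq)):
        # Slide the window: add the entering base, drop the leaving one.
        running += is_gc[i] - is_gc[i - window]
        profile.append(100 * running / window)
    return profile


# Toy example with a 4 bp window (the report uses 300 bp).
print(windowed_gc("GGCCAATT", window=4))
```

The counts sum to the stated full length, and the overall GC content comes out at roughly 40%, consistent with the note that no high-GC region is present.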


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr12  +        89251425   89254424  3000
YourSeq     39   2203  2274   3000     82.3%  chr2   +        93396667   93396733    67
YourSeq     36   2441  2515   3000     92.9%  chr19  -        47162647   47162723    77
YourSeq     35   2438  2507   3000     97.3%  chr8   -       125558054  125558124    71
YourSeq     33   2207  2518   3000     88.6%  chr10  +         7997224    7997533   310
YourSeq     32   1629  1721   3000     91.2%  chr4   +       108676976  108677067    92
YourSeq     27   2185  2224   3000     86.7%  chr1   -       177414654  177414691    38
YourSeq     27   1441  1479   3000     87.1%  chr1   -        61044890   61044927    38
YourSeq     26   2414  2451   3000     84.3%  chr5   +       134724791  134724828    38
YourSeq     26   2435  2466   3000     90.7%  chr13  +        17584290   17584321    32
YourSeq     24    860   890   3000     96.2%  chr2   +        36158285   36158316    32
YourSeq     20   2439  2460   3000     95.5%  chr11  -        69534372   69534393    22

Note: The 3000 bp section upstream of Exon 6 is BLAT searched against the genome. Apart from the expected self-match on chr12, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
---------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr12  +        89255364   89258363  3000
YourSeq     55   1780  1869   3000     83.9%  chr7   +       115156417  115156503    87
YourSeq     44   1805  1861   3000     81.3%  chr9   -       122042263  122042311    49
YourSeq     43   1810  1884   3000     87.8%  chr7   +        72911615   72911690    76
YourSeq     42   1764  1829   3000     88.9%  chr8   -        66774472   66774538    67
YourSeq     42   1789  1835   3000     95.8%  chr12  +       107326080  107326127    48
YourSeq     40   1761  1819   3000     91.9%  chr13  +        28706193   28706251    59
YourSeq     37   1800  1842   3000     95.3%  chr10  -        70762335   70762378    44
YourSeq     36   1766  1825   3000     78.3%  chr5   -       142091115  142091169    55
YourSeq     36   1762  1825   3000     78.2%  chr1   -        55573284   55573347    64
YourSeq     36   1781  1824   3000     86.1%  chr7   +       130273935  130273977    43
YourSeq     35   1731  1818   3000     92.5%  chr10  -       107607182  107607460   279
YourSeq     35   1816  1883   3000     85.0%  chr7   +       111499077  111499142    66
YourSeq     35   1810  1851   3000     92.9%  chr13  +        69276666   69276708    43
YourSeq     34   1799  1836   3000     94.8%  chr1   -       127665760  127665797    38
YourSeq     33   1789  1829   3000     94.6%  chr12  -       107137855  107137896    42
YourSeq     33   1801  1834   3000    100.0%  chr4   +       137166350  137166384    35
YourSeq     33   1789  1821   3000    100.0%  chr14  +        87149275   87149307    33
YourSeq     32   1805  1860   3000     88.3%  chr1   -       166759341  166759394    54
YourSeq     32   2724  2765   3000     88.3%  chr15  +        47982083   47982122    40

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. Apart from the expected self-match on chr12, no significant similarity is found.
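The chr12 self-matches in the two BLAT tables can be used to sanity-check the stated cKO region size. The sketch below assumes that the coordinates are 1-based and inclusive (as the SPAN column suggests) and that the two 3000 bp queries directly abut the cKO region; neither assumption is stated explicitly in the report.

```python
# chr12 self-match coordinates taken from the two BLAT tables.
upstream = (89251425, 89254424)    # 3000 bp section upstream of exon 6
downstream = (89255364, 89258363)  # 3000 bp section downstream of exon 6


def span(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Assumption: the cKO region is exactly the gap between the two sections.
cko_start = upstream[1] + 1
cko_end = downstream[0] - 1
cko_length = span(cko_start, cko_end)

print("upstream span:  ", span(*upstream))
print("downstream span:", span(*downstream))
print(f"cKO region: chr12:{cko_start}-{cko_end} ({cko_length} bp)")
```

Under these assumptions the gap is 939 bp, which agrees with the effective cKO region size of ~939 bp given in the strategy summary.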


Gene and information: Nrxn3 neurexin III [Mus musculus (house mouse)]
Gene ID: 18191, updated on 10-Oct-2019

Gene summary

Official Symbol: Nrxn3 (provided by MGI)
Official Full Name: neurexin III (provided by MGI)
Primary source: MGI:MGI:1096389
See related: Ensembl:ENSMUSG00000066392
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Summary: This gene encodes a member of a family of proteins that function in the nervous system as receptors and cell adhesion molecules. Extensive alternative splicing and the use of alternative promoters results in multiple transcript variants for this gene, but the full-length nature of many of these variants has not been determined. Transcripts that initiate from an upstream promoter encode alpha isoforms, which contain epidermal growth factor-like (EGF-like) sequences and laminin G domains. Transcripts initiating from the downstream promoter encode beta isoforms, which lack EGF-like sequences. [provided by RefSeq, Dec 2012]
Expression: Biased expression in frontal lobe adult (RPKM 7.6), cerebellum adult (RPKM 6.7) and 6 other tissues
Orthologs: human, all

Genomic context

Location: 12 D3; 12 42.94 cM

Exon count: 36

Annotation release  Status             Assembly                    Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (88722701..90334935)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (90191834..91573375)

[Figure: genomic context of Nrxn3 on chromosome 12 (NC_000078.6).]


Transcript information: This gene has 14 transcripts

Gene: Nrxn3 ENSMUSG00000066392

Description: neurexin III [Source:MGI Symbol;Acc:MGI:1096389]
Gene Synonyms: 4933401A11Rik, 9330112C09Rik, D12Bwg0831e, neurexin III alpha, neurexin III beta
Location: Chromosome 12: 88,722,876-90,334,935 forward strand. GRCm38:CM001005.2
About this gene: This gene has 14 transcripts (splice variants), 155 orthologues, 35 paralogues, is a member of 2 Ensembl protein families and is associated with 6 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Nrxn3-205  ENSMUST00000163134.7   7922  1571aa      ENSMUSP00000129678.1  Protein coding   CCDS56856  Q6P9K9      TSL:5 GENCODE basic APPRIS P1
Nrxn3-201  ENSMUST00000057634.13  4634  1100aa      ENSMUSP00000050075.7  Protein coding   CCDS49134  Q6P9K9      TSL:1 GENCODE basic
Nrxn3-202  ENSMUST00000110130.3   3654  567aa       ENSMUSP00000105757.3  Protein coding   CCDS79143  Q8C985      TSL:2 GENCODE basic
Nrxn3-210  ENSMUST00000167887.7   3303  1100aa      ENSMUSP00000127926.1  Protein coding   CCDS49134  Q6P9K9      TSL:5 GENCODE basic
Nrxn3-208  ENSMUST00000167103.7   6944  1391aa      ENSMUSP00000127407.1  Protein coding   -          E9Q2X2      TSL:5 GENCODE basic
Nrxn3-213  ENSMUST00000190626.6   5583  1009aa      ENSMUSP00000139879.1  Protein coding   -          A0A087WPQ9  TSL:1 GENCODE basic
Nrxn3-214  ENSMUST00000238943.1   3418  635aa       ENSMUSP00000159078.1  Protein coding   -          -           GENCODE basic
Nrxn3-203  ENSMUST00000110133.8   2699  430aa       ENSMUSP00000105760.2  Protein coding   -          E9Q3Q4      TSL:5 GENCODE basic
Nrxn3-209  ENSMUST00000167734.7   4086  No protein  -                     Retained intron  -          -           TSL:2
Nrxn3-204  ENSMUST00000110138.7   2278  No protein  -                     Retained intron  -          -           TSL:2
Nrxn3-206  ENSMUST00000163944.1   673   No protein  -                     lncRNA           -          -           TSL:3
Nrxn3-207  ENSMUST00000164072.1   646   No protein  -                     lncRNA           -          -           TSL:5
Nrxn3-212  ENSMUST00000190030.1   528   No protein  -                     lncRNA           -          -           TSL:2
Nrxn3-211  ENSMUST00000170533.1   525   No protein  -                     lncRNA           -          -           TSL:2


[Figure: Ensembl genome browser view of the Nrxn3 locus (1.63 Mb, forward strand, axis marks at 89.0 Mb, 89.5 Mb and 90.0 Mb). Forward-strand tracks (comprehensive gene set): Nrxn3 protein coding transcripts Nrxn3-201, -202, -203, -205, -208, -210, -213 and -214; retained intron transcripts Nrxn3-204 and -209; lncRNA transcripts Nrxn3-206, -207, -211 and -212; Gm23989-201 (miRNA); Gm48692-201 and Gm48700-201 (TEC). Contigs: AC161049.2, CR974583.24. Reverse-strand genes: Gm37980-201 (TEC), Gm27313-201 (miRNA). Regulatory Build track legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000163134

[Figure: protein domain and variant view of Nrxn3-205 (protein coding; 1.54 Mb, forward strand), scale bar 0 to 1571 aa. Tracks: transmembrane helices; PDB-ENSP mappings; MobiDB lite; low complexity (Seg); Superfamily (Concanavalin A-like lectin/glucanase domain superfamily); SMART (EGF-like domain, Neurexin/syndecan/glycophorin C, Laminin G domain); Pfam (EGF-like domain, Syndecan/Neurexin domain, Laminin G domain); PROSITE profiles (EGF-like domain, Laminin G domain); PROSITE patterns (EGF-type aspartate/asparagine hydroxylation site); PANTHER (PTHR15036, PTHR15036:SF48); Gene3D (2.10.25.10, 2.60.120.200); CDD (cd00054, cd00110); sequence variants from dbSNP and all other sources (legend: missense variant, synonymous variant).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
