https://www.alphaknockout.com

Mouse Snx21 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Snx21 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Snx21 gene (NCBI Reference Sequence: NM_133924; Ensembl: ENSMUSG00000050373) is located on mouse chromosome 2. Three exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 3 (Transcript: ENSMUST00000056181). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the mouse Snx21 gene. To engineer the targeting vector, the homology arms and cKO region will be generated by PCR using BAC clone RP23-353P18 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 2 starts at about 23.88% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 373 bp, and the size of intron 2 for 3'-loxP site insertion is 4857 bp. The size of the effective cKO region is ~630 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1, 2 and 3, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Snx21; cKO region; loxP site.]
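The step from the targeted allele to the constitutive KO allele can be illustrated with a small string model. This is a minimal sketch, not the vendor's procedure: it assumes two loxP sites in the same orientation, uses the canonical 34 bp loxP sequence, and the allele string with its lowercase placeholder labels is hypothetical. The function name `cre_excise` is likewise introduced here only for illustration.

```python
# Canonical 34 bp loxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre recombination between two same-orientation loxP sites.

    The segment between the first and second site is removed and a single
    site is left behind. Alleles with fewer than two sites are returned
    unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume
    # after the second site.
    return allele[: first + len(site)] + allele[second + len(site):]


# Hypothetical floxed allele: the labels stand in for real sequence.
targeted = "exon1-intron1-" + LOXP + "-EXON2-" + LOXP + "-intron2-exon3"
ko_allele = cre_excise(targeted)
print(ko_allele)
```

In this toy model the floxed segment disappears and exactly one loxP site remains, which mirrors the "Constitutive KO allele (After Cre recombination)" row of the schematic.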


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
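The self-alignment behind a dot plot can be sketched as an exact-match word search. This is an illustrative sketch only: the tool actually used for the report is not stated, exact matching of 10 bp words is an assumption, and the two toy sequences are made up.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) exact window matches.

    forward: window at i equals window at j, with j > i (off-diagonal only).
    reverse: reverse complement of the window at i equals the window at j.
    """
    index = {}
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index.setdefault(seq[i:i + window], []).append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward += [(i, j) for j in index[word] if j > i]
        reverse += [(i, j) for j in index.get(revcomp(word), [])]
    return forward, reverse


# Toy sequence with a direct repeat of one 10 bp word.
direct = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"
fwd, _ = self_dot_plot(direct)

# Toy sequence with an inverted repeat (word followed by its reverse complement).
inverted = "AAAACCCCGG" + "T" + revcomp("AAAACCCCGG")
_, rev = self_dot_plot(inverted)
```

Runs of forward matches parallel to the main diagonal would indicate tandem or direct repeats; an empty or near-empty off-diagonal set is consistent with the "no significant tandem repeat" reading in the note.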

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(7030bp) | A(24.77% 1741) | C(24.05% 1691) | T(25.55% 1796) | G(25.63% 1802)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
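The summary counts can be checked, and a windowed GC profile computed, with a few lines of Python. This is a sketch: the base counts are taken from the summary line above, while `windowed_gc` and its `step` parameter are hypothetical helpers, since the report does not say how its 300 bp windows were stepped.

```python
# Base counts from the summary line of the report.
counts = {"A": 1741, "C": 1691, "T": 1796, "G": 1802}

total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, overall GC = {gc_percent:.2f}%")


def windowed_gc(seq: str, window: int = 300, step: int = 1):
    """GC percentage of each window of `window` bp, advancing by `step` bp."""
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append(100 * gc / window)
    return profile


# Toy input; the real 7030 bp sequence is not reproduced in the report.
toy_profile = windowed_gc("GGCCAATT", window=4, step=2)
```

The overall GC content is close to 50%, so the difficulty flagged in the note comes from local peaks in the windowed profile rather than from the average.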


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       164783513  164786512  3000
YourSeq  295    57     1170  3000   91.0%     chr10  +       39544296   39918871   374576
YourSeq  273    35     1184  3000   93.7%     chr1   -       153582152  153813934  231783
YourSeq  178    18     689   3000   84.4%     chr5   +       146730088  146730335  248
YourSeq  165    18     216   3000   90.3%     chr7   +       127762796  127762980  185
YourSeq  162    23     218   3000   92.1%     chr2   -       174573365  174573552  188
YourSeq  159    17     218   3000   89.4%     chr1   -       160921520  160921712  193
YourSeq  159    73     776   3000   92.6%     chr9   +       106818959  107244208  425250
YourSeq  158    57     410   3000   89.4%     chr1   +       181164689  181165023  335
YourSeq  157    23     218   3000   89.2%     chr1   -       72555755   72555930   176
YourSeq  156    41     218   3000   94.9%     chr1   -       23901975   23902159   185
YourSeq  154    17     215   3000   88.4%     chr3   -       95657974   95658147   174
YourSeq  152    22     210   3000   93.2%     chr5   +       113810800  113811019  220
YourSeq  151    16     215   3000   90.4%     chr3   +       109629607  109629778  172
YourSeq  151    20     213   3000   88.2%     chr13  +       65573339   65573507   169
YourSeq  150    19     216   3000   91.2%     chr11  -       62880120   62880296   177
YourSeq  150    18     238   3000   91.7%     chr11  -       9336344    9336789    446
YourSeq  150    52     216   3000   94.3%     chr1   -       64786037   64786195   159
YourSeq  150    18     217   3000   88.5%     chr9   +       31197601   31197766   166
YourSeq  149    25     216   3000   90.2%     chr11  +       16452280   16452464   185

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       164787107  164790106  3000
YourSeq  189    2340   2682  3000   87.9%     chr4   -       75146514   75146755   242
YourSeq  187    2340   2763  3000   86.8%     chr16  -       20352173   20352387   215
YourSeq  182    2348   2692  3000   95.1%     chrX   +       13228014   13228567   554
YourSeq  181    2331   2547  3000   93.0%     chr18  +       77099156   77099373   218
YourSeq  180    2336   2534  3000   95.5%     chrX   -       103974958  103975158  201
YourSeq  180    2336   2534  3000   95.5%     chr3   -       87881983   87882183   201
YourSeq  180    2336   2534  3000   95.5%     chr13  -       91145779   91145979   201
YourSeq  180    2337   2534  3000   95.5%     chr14  +       55696274   55696471   198
YourSeq  180    2338   2534  3000   96.0%     chr1   +       170929608  170929806  199
YourSeq  179    2336   2534  3000   96.4%     chr19  +       46024250   46024453   204
YourSeq  178    2335   2534  3000   95.0%     chr11  -       106270939  106665365  394427
YourSeq  178    2337   2534  3000   95.5%     chr11  -       30656611   30656832   222
YourSeq  178    2338   2534  3000   95.5%     chr5   +       12845697   12845895   199
YourSeq  177    2340   2534  3000   95.4%     chr8   -       109822799  109822993  195
YourSeq  177    2338   2534  3000   95.9%     chr4   -       141504967  141505169  203
YourSeq  177    2336   2534  3000   95.5%     chr17  -       12694882   12695081   200
YourSeq  177    2326   2542  3000   90.1%     chr16  -       17742720   17742922   203
YourSeq  177    2336   2547  3000   91.9%     chr11  -       76411416   76411620   205
YourSeq  177    2334   2534  3000   94.1%     chr2   +       180003855  180004055  201

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
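The top hit of each BLAT search gives the genomic position of the corresponding 3000 bp query, so the coordinates can be cross-checked with simple interval arithmetic. This is a sketch using only the chr2 coordinates from the two tables; the variable names are illustrative, and coordinates are treated as 1-based inclusive, which is consistent with the reported SPAN of 3000.

```python
# Top (100% identity) hits from the two BLAT tables, chr2, forward strand.
upstream = (164783513, 164786512)
downstream = (164787107, 164790106)


def span(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


# Interval left between the end of the upstream query and the start of the
# downstream query.
gap_start = upstream[1] + 1
gap_end = downstream[0] - 1
gap_length = gap_end - gap_start + 1

print(f"upstream span   = {span(upstream)} bp")
print(f"downstream span = {span(downstream)} bp")
print(f"interval between queries: chr2:{gap_start}-{gap_end} ({gap_length} bp)")
```

Both spans reproduce the 3000 bp query size, and the two queries do not overlap.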


Gene information: Snx21, sorting nexin family member 21 [Mus musculus (house mouse)]. Gene ID: 101113, updated on 14-Aug-2019

Gene summary

Official Symbol: Snx21 (provided by MGI)
Official Full Name: sorting nexin family member 21 (provided by MGI)
Primary source: MGI:MGI:1917729
See related: Ensembl:ENSMUSG00000050373
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI481716; 5730407K14Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 51.8), ovary adult (RPKM 18.5) and 26 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 H3

Exon count: 4

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (164785759..164793810)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (164611521..164618270)

[Figure: genomic context of Snx21 on chromosome 2 (NC_000068.7).]


Transcript information: This gene has 7 transcripts

Gene: Snx21 ENSMUSG00000050373

Description: sorting nexin family member 21 [Source:MGI Symbol;Acc:MGI:1917729]
Gene Synonyms: 5730407K14Rik
Location: Chromosome 2: 164,785,823-164,793,816 forward strand. GRCm38:CM000995.2
About this gene: This gene has 7 transcripts (splice variants), 185 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Snx21-201  ENSMUST00000056181.6  2590  363aa       ENSMUSP00000054137.6  Protein coding   CCDS38328  Q3UR97   TSL:1 GENCODE basic APPRIS P1
Snx21-204  ENSMUST00000172577.7  1825  142aa       ENSMUSP00000134133.1  Protein coding   -          G3UYL5   TSL:2 GENCODE basic
Snx21-206  ENSMUST00000174070.7  1446  148aa       ENSMUSP00000133344.1  Protein coding   -          G3UWM2   TSL:5 GENCODE basic
Snx21-203  ENSMUST00000152471.1  1243  151aa       ENSMUSP00000133914.1  Protein coding   -          G3UY23   TSL:1 GENCODE basic
Snx21-202  ENSMUST00000140519.1  383   48aa        ENSMUSP00000134256.1  Protein coding   -          G3UYX2   CDS 5' incomplete TSL:3
Snx21-205  ENSMUST00000173945.1  3262  No protein  -                     Retained intron  -          -        TSL:1
Snx21-207  ENSMUST00000174342.1  1604  No protein  -                     lncRNA           -          -        TSL:1


[Figure: Ensembl genome browser view of a 27.99 kb region of chromosome 2 (164.78-164.80 Mb). Forward strand: Ube2c-201 (protein coding) and the Snx21 transcripts Snx21-201, -202, -203, -204 and -206 (protein coding), Snx21-205 (retained intron) and Snx21-207 (lncRNA). Contig: AL591127.12. Reverse strand: Tnnc2-201 (protein coding), Tnnc2-202 (lncRNA), Acot8-201 and Acot8-202 (protein coding), Acot8-203 and Acot8-205 (retained intron) and Acot8-204 (nonsense mediated decay). Regulatory Build track legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000056181

[Figure: protein domain view of Snx21-201 (protein coding; 7.82 kb, forward strand; scale 0 to 363 aa). Annotation tracks: PDB-ENSP mappings; MobiDB lite; low complexity (Seg); Superfamily (Tetratricopeptide-like helical domain superfamily; PX domain superfamily); SMART, Pfam and PROSITE profiles (Phox homologous domain); PANTHER (Sorting nexin SNX20/SNX21; PTHR20939:SF10); Gene3D (PX domain superfamily); sequence variants from dbSNP and all other sources (missense variant; splice region variant; synonymous variant).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
