https://www.alphaknockout.com

Mouse Dym Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Dym conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Dym gene (NCBI Reference Sequence: NM_027727; Ensembl: ENSMUSG00000035765) is located on mouse chromosome 18. Seventeen exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 17 (Transcript: ENSMUST00000039608). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Dym gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-46L22 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene-trapped allele display decreased body size with short tubular bones, chondrodysplasia, partial penetrance of obstructive hydronephrosis and impaired vesicular transport.

Exon 3 starts at about 7.03% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2 for 5'-loxP site insertion is 9711 bp, and the size of intron 3 for 3'-loxP site insertion is 2247 bp. The size of the effective cKO region is ~553 bp. The cKO region does not contain any other known gene.
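The frameshift statement rests on reading-frame arithmetic: deleting a number of coding nucleotides that is not a multiple of 3 shifts the frame of everything downstream. The Python sketch below illustrates this check and converts the 7.03% figure into an approximate coding-sequence position. Two values are assumptions, not taken from the design: the coding-sequence length of 2010 nt (derived from the 669 aa protein of Dym-201 plus one stop codon), and the exon length of 100 nt, which is a hypothetical placeholder because the report does not state the length of exon 3.

```python
def causes_frameshift(deleted_cds_nt: int) -> bool:
    """Return True if deleting this many coding nucleotides shifts the reading frame."""
    if deleted_cds_nt <= 0:
        raise ValueError("deleted length must be positive")
    return deleted_cds_nt % 3 != 0


def approx_cds_position(fraction: float, cds_length: int) -> int:
    """Approximate 1-based CDS position at a given fraction of the coding region."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return max(1, round(fraction * cds_length))


# Assumption: 669 codons plus one stop codon for transcript Dym-201.
CDS_LENGTH_NT = 669 * 3 + 3
# From the report: exon 3 starts at about 7.03% of the coding region.
EXON3_START_FRACTION = 0.0703
# Hypothetical placeholder; the real exon 3 length is not given in the report.
HYPOTHETICAL_EXON_NT = 100

print("approximate CDS position of exon 3 start:",
      approx_cds_position(EXON3_START_FRACTION, CDS_LENGTH_NT))
print("hypothetical exon causes frameshift:",
      causes_frameshift(HYPOTHETICAL_EXON_NT))
```

With the real exon 3 length substituted for the placeholder, the same check would confirm or contradict the frameshift claim for this design.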


Overview of the Targeting Strategy

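[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exon labels 1, 3, 4 and 17, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Dym; homology arm; cKO region; loxP site.]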


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself. Legend: Forward; Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
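The self-alignment described in the note can be sketched as an exact-match dot plot: every window of the sequence is compared with every other window, on the forward strand and against the reverse complement, and matches away from the main diagonal indicate repeats. This is a minimal illustration, not the tool used to produce the figure; the function names and the toy sequences are hypothetical, and only the 10 bp window size comes from the report.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dot_plot_matches(seq_a: str, seq_b: str, window: int = 10):
    """Return (i, j) pairs where seq_a[i:i+window] equals seq_b[j:j+window]."""
    index = defaultdict(list)
    for j in range(len(seq_b) - window + 1):
        index[seq_b[j:j + window]].append(j)
    matches = []
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], ()):
            matches.append((i, j))
    return matches


def repeated_window_fraction(seq: str, window: int = 10) -> float:
    """Fraction of windows that also occur at another forward-strand position."""
    n_windows = len(seq) - window + 1
    if n_windows <= 0:
        return 0.0
    repeated = {i for i, j in dot_plot_matches(seq, seq, window) if i != j}
    return len(repeated) / n_windows


# Toy sequences (hypothetical): one without repeats, one tandem repeat.
unique_seq = "AAACCCGGGTTT"
tandem_seq = "ACGTTGCA" * 3
print(repeated_window_fraction(unique_seq, window=4))
print(repeated_window_fraction(tandem_seq, window=4))
# Reverse-complement matches are found by comparing against the flipped strand.
print(len(dot_plot_matches(tandem_seq, reverse_complement(tandem_seq), window=4)))
```

A sequence suitable for PCR screening would show few or no off-diagonal matches at the chosen window size.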

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(7053bp) | A(23.89% 1685) | C(21.11% 1489) | T(31.83% 2245) | G(23.17% 1634)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
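The overall GC content follows directly from the base counts in the summary, and the windowed profile can be computed by sliding a fixed-size window along the sequence. The sketch below shows both calculations under simple assumptions: the base counts and the 300 bp window size come from the report, while the function names and the short toy sequence are hypothetical.

```python
def gc_percent_from_counts(a: int, c: int, g: int, t: int) -> float:
    """Return GC content in percent from base counts."""
    total = a + c + g + t
    if total == 0:
        raise ValueError("no bases counted")
    return 100.0 * (g + c) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """Return (start, gc_percent) for every full window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Base counts from the report summary (full length 7053 bp).
overall = gc_percent_from_counts(a=1685, c=1489, g=1634, t=2245)
print(f"overall GC content: {overall:.2f}%")

# Toy sequence (hypothetical) to show the windowed profile.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

From the reported counts the overall GC content is about 44.28%, consistent with the note that no region of high GC content stands out.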


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr18  +       75049732   75052731   3000
YourSeq  127    198    2596  3000   87.3%     chr8   +       117091230  117491610  400381
YourSeq  104    158    288   3000   90.0%     chr6   +       63569266   63569396   131
YourSeq  102    164    303   3000   88.1%     chr15  -       82387452   82387589   138
YourSeq  102    156    284   3000   90.7%     chr1   +       158213137  158213270  134
YourSeq  100    156    288   3000   88.5%     chr1   -       35521786   35521916   131
YourSeq  98     149    288   3000   90.1%     chr10  -       47997314   47997453   140
YourSeq  98     156    288   3000   87.7%     chr3   +       72418477   72418607   131
YourSeq  96     155    288   3000   89.5%     chr4   -       32903006   32903159   154
YourSeq  96     164    303   3000   89.5%     chr6   +       14643340   14643479   140
YourSeq  96     156    288   3000   89.6%     chr3   +       79319621   79319751   131
YourSeq  94     191    420   3000   84.1%     chrX   -       166412768  166413178  411
YourSeq  94     149    288   3000   89.9%     chr11  -       60315189   60315328   140
YourSeq  92     157    288   3000   87.8%     chr4   +       117836809  117836941  133
YourSeq  92     185    388   3000   81.1%     chr2   +       33681991   33682155   165
YourSeq  90     164    287   3000   89.6%     chr1   -       183321521  183321646  126
YourSeq  89     185    297   3000   87.0%     chr17  -       88074046   88074154   109
YourSeq  89     155    288   3000   88.8%     chr8   +       47374367   47374503   137
YourSeq  89     167    288   3000   89.4%     chr6   +       30039619   30039741   123
YourSeq  88     156    288   3000   89.5%     chr5   +       82584021   82584151   131

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
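The report does not state what counts as "significant similarity", so the sketch below shows only one plausible way to screen BLAT rows: parse each row, compute how much of the query it covers, and flag hits that are both highly identical and long. The thresholds (95% identity, 50% query coverage) and all function names are hypothetical assumptions; the two example rows are the first two rows of the upstream search.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated row of the BLAT results listing."""
    fields = row.split()
    # Drop the "browser details" link text if it is still attached to the row.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the hit (1-based inclusive coordinates)."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


def is_significant(hit: BlatHit, min_identity: float = 95.0,
                   min_coverage: float = 0.5) -> bool:
    """Hypothetical criterion: high identity over a large part of the query."""
    return hit.identity >= min_identity and query_coverage(hit) >= min_coverage


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr18 + 75049732 75052731 3000",
    "YourSeq 127 198 2596 3000 87.3% chr8 + 117091230 117491610 400381",
]
hits = [parse_blat_row(r) for r in rows]
for hit in hits:
    print(hit.chrom, hit.score, round(query_coverage(hit), 3), is_significant(hit))
```

Under these assumed thresholds only the self-hit on chr18 is flagged, which matches the note's conclusion for the rows shown.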

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr18  +       75053285   75056284   3000
YourSeq  337    94     1678  3000   91.7%     chr11  -       109519616  109840992  321377
YourSeq  304    29     635   3000   90.9%     chr5   -       147621841  147622728  888
YourSeq  242    1428   1839  3000   78.9%     chr4   +       136125696  136126101  406
YourSeq  239    1440   1825  3000   83.6%     chr4   +       137934718  137935065  348
YourSeq  229    1440   1820  3000   87.6%     chr1   -       90771641   90772021   381
YourSeq  225    1442   2273  3000   85.2%     chr4   +       45743307   45743986   680
YourSeq  221    1446   1843  3000   81.4%     chr5   +       147129031  147129425  395
YourSeq  219    1464   1825  3000   83.9%     chr17  -       8351955    8352298    344
YourSeq  219    1465   1818  3000   82.1%     chr4   +       41617123   41617474   352
YourSeq  217    1423   1777  3000   83.9%     chr5   -       53141467   53141807   341
YourSeq  213    1440   1825  3000   83.1%     chr7   +       79230537   79230917   381
YourSeq  212    1440   1820  3000   85.1%     chr16  +       92990566   92990944   379
YourSeq  211    1440   1820  3000   85.6%     chr9   -       65287104   65287470   367
YourSeq  211    1449   1820  3000   82.8%     chr10  +       75959312   75959684   373
YourSeq  210    1441   1803  3000   83.8%     chr18  -       76332603   76332954   352
YourSeq  210    1435   1825  3000   85.4%     chr10  -       21666551   21666943   393
YourSeq  210    1441   1804  3000   82.2%     chr5   +       131412501  131412862  362
YourSeq  209    1153   1800  3000   83.4%     chr1   -       36193396   36194020   625
YourSeq  206    1523   1820  3000   84.9%     chr5   -       149393254  149393555  302

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
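The top hit of each BLAT search places the two 3000 bp flanks on chr18, and the interval between them can be computed from those coordinates. The sketch below does that arithmetic, assuming the flanks directly abut the cKO region (the report does not say so explicitly) and that the listed coordinates are 1-based and inclusive, which is consistent with each self-hit spanning exactly 3000 bp. The function name is hypothetical.

```python
# Genomic coordinates of the 100% self-hits from the two BLAT searches (chr18).
UP_FLANK = (75049732, 75052731)    # 3000 bp upstream section
DOWN_FLANK = (75053285, 75056284)  # 3000 bp downstream section


def region_between(up: tuple, down: tuple) -> tuple:
    """Return (start, end, size) of the interval lying between two flanks.

    Coordinates are treated as 1-based and inclusive.
    """
    start = up[1] + 1
    end = down[0] - 1
    if end < start:
        raise ValueError("flanks overlap or are adjacent")
    return start, end, end - start + 1


start, end, size = region_between(UP_FLANK, DOWN_FLANK)
print(f"chr18:{start}-{end} ({size} bp)")
# prints: chr18:75052732-75053284 (553 bp)
```

The 553 bp result agrees with the effective cKO region size given in the strategy summary.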


Gene and information: Dym dymeclin [ Mus musculus (house mouse) ] Gene ID: 69190, updated on 14-Aug-2019

Gene summary

Official Symbol: Dym (provided by MGI)
Official Full Name: dymeclin (provided by MGI)
Primary source: MGI:MGI:1918480
See related: Ensembl:ENSMUSG00000035765
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 1810041M12Rik; 4933427L07Rik; C030019K18Rik
Expression: Ubiquitous expression in testis adult (RPKM 9.1), cerebellum adult (RPKM 5.0) and 28 other tissues
Orthologs: human

Genomic context

Location: 18; 18 E2-E3

Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  18   NC_000084.6 (75018699..75286966)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    18   NC_000084.5 (75178426..75446620)

[Figure: genomic context of Dym on Chromosome 18 - NC_000084.6.]


Transcript information: This gene has 6 transcripts

Gene: Dym ENSMUSG00000035765

Description: dymeclin [Source:MGI Symbol;Acc:MGI:1918480]
Gene Synonyms: 1810041M12Rik, 4933427L07Rik, C030019K18Rik
Location: 75,018,781-75,286,964 forward strand. GRCm38:CM001011.2
About this gene: This gene has 6 transcripts (splice variants), 197 orthologues, is a member of 1 Ensembl protein family and is associated with 25 phenotypes.

Transcripts

Name     Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Dym-201  ENSMUST00000039608.8  2456  669aa       ENSMUSP00000047054.7  Protein coding   CCDS29346  Q8CHY3      TSL:1 GENCODE basic APPRIS P1
Dym-204  ENSMUST00000235692.1  3007  344aa       ENSMUSP00000157969.1  Protein coding   -          A0A494BA72  GENCODE basic
Dym-205  ENSMUST00000236220.1  1571  263aa       ENSMUSP00000157621.1  Protein coding   -          A0A494B9E2  GENCODE basic
Dym-203  ENSMUST00000235554.1  4307  No protein  -                     Retained intron  -          -           -
Dym-206  ENSMUST00000236840.1  556   No protein  -                     lncRNA           -          -           -
Dym-202  ENSMUST00000235545.1  515   No protein  -                     lncRNA           -          -           -


[Figure: Ensembl genome browser view of the Dym locus, 288.18 kb, with a scale from 75.05 Mb to 75.25 Mb. Forward-strand transcripts: BC031181-201 (protein coding), BC031181-202 (retained intron), BC031181-203 (protein coding), BC031181-204 (protein coding), Dym-201 (protein coding), Dym-202 (lncRNA), Dym-203 (retained intron), Dym-204 (protein coding), Dym-205 (protein coding), Dym-206 (lncRNA), Gm27781-201 (rRNA). Contigs: AC125122.4, AC132385.5, AC121511.12. Reverse-strand transcripts: Gm8807-201 (processed pseudogene), 2010010A06Rik-201 (transcribed processed pseudogene), 2010010A06Rik-202 (lncRNA), 2010010A06Rik-203 (lncRNA). Regulatory Build track legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank; Transcription Factor Binding Site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript; pseudogene).]


Transcript: ENSMUST00000039608

[Figure: Dym-201 (protein coding), 268.18 kb, forward strand, with protein feature tracks for ENSMUSP00000047...: Low complexity (Seg); Superfamily: Armadillo-type fold; Pfam: PF09742; PANTHER: Dymeclin; sequence variants (dbSNP and all other sources). Variant legend: stop gained; missense variant; splice region variant; synonymous variant. Scale bar: 0 to 669 amino acids.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
