https://www.alphaknockout.com

Mouse Dbr1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Dbr1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas9-mediated genome engineering.

Strategy summary: The Dbr1 gene (NCBI Reference Sequence: NM_031403; Ensembl: ENSMUSG00000032469) is located on mouse chromosome 9. Eight exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000066650). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Dbr1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-383F21 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a gene trap allele exhibit prenatal lethality. Mice heterozygous for this allele exhibit impaired class switch recombination in B cells.

Exon 3 starts at about 19.58% of the coding region. Knockout of exons 3~4 will result in a frameshift of the gene. Intron 2, the site of 5'-loxP insertion, is 1740 bp; intron 4, the site of 3'-loxP insertion, is 696 bp. The effective cKO region is ~1502 bp and contains no other known gene.
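The frameshift claim rests on a simple rule: removing coding sequence whose total length is not a multiple of 3 shifts the downstream reading frame. The sketch below illustrates that rule only. The exon coding lengths in it are hypothetical placeholders, not values from this report, and the function name is an assumption made for illustration.

```python
def deletion_causes_frameshift(deleted_cds_lengths):
    """Return True if deleting these coding segments shifts the reading frame.

    A deletion keeps the frame only when the total number of removed
    coding bases is a multiple of 3.
    """
    return sum(deleted_cds_lengths) % 3 != 0


# HYPOTHETICAL coding lengths (bp) for the deleted exons; replace them with
# the real exon 3 and exon 4 coding lengths from the transcript annotation.
hypothetical_exon_cds_bp = {"exon3": 100, "exon4": 120}

total_bp = sum(hypothetical_exon_cds_bp.values())
print("deleted coding bases:", total_bp)
print("remainder mod 3:", total_bp % 3)
print("frameshift:", deletion_causes_frameshift(hypothetical_exon_cds_bp.values()))
```

With the actual exon lengths substituted, a non-zero remainder would be consistent with the frameshift stated above.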


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons labeled 1 to 8, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Dbr1, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence aligned against itself (axes labeled "Sequence 12"), showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
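A self dot plot of this kind can be sketched as follows. This is a minimal illustration, not the tool used to produce the report: it assumes exact matching of fixed-size windows (10 bp by default, as in the plot) and an uppercase A/C/G/T sequence. The function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dotplot(seq, window=10):
    """Compare a sequence with itself using exact window matches.

    Returns (forward, reverse): lists of (i, j) start positions where the
    window at i equals the window at j (forward, off-diagonal only, j > i)
    or equals the reverse complement of the window at j (reverse).
    """
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Toy sequence containing one direct repeat of "GATTACA".
fwd, rev = self_dotplot("GATTACAGATTACA", window=7)
print(fwd, rev)
```

Runs of off-diagonal forward matches at a constant offset (j - i) would indicate a tandem or direct repeat; an empty or sparse result corresponds to the "no significant tandem repeat" conclusion.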

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis labeled "Sequence 12").]

Summary: Full length (8002 bp) | A (24.42%, 1954) | C (20.53%, 1643) | T (33.15%, 2653) | G (21.89%, 1752)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so this targeting vector may be difficult to construct.
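The base counts in the summary can be checked directly, and the windowed analysis can be sketched in a few lines. The counts below are taken from the summary line; the `windowed_gc` helper, its `step` parameter, and the toy sequence are assumptions for illustration, since the report does not include the underlying sequence or name the tool used.

```python
# Base counts as reported in the summary line.
counts = {"A": 1954, "C": 1643, "T": 2653, "G": 1752}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

print("total length:", total)
print("per-base percent:", percent)
print("overall GC percent:", gc_percent)


def windowed_gc(seq, window=300, step=1):
    """GC fraction of each full window of the given size along seq."""
    seq = seq.upper()
    fractions = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        fractions.append((chunk.count("G") + chunk.count("C")) / window)
    return fractions


# Toy example with a small window; the report uses a 300 bp window.
print(windowed_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 8002 bp and reproduce the reported percentages. The overall GC content is moderate, so the "high GC" warning refers to local windows rather than to the sequence as a whole.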


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr9   +       99575192   99578191   3000
YourSeq  151    233    441   3000   88.8%     chr9   -       106490834  106491016  183
YourSeq  144    233    389   3000   96.2%     chr12  -       55267800   55267958   159
YourSeq  144    233    385   3000   97.4%     chr10  +       7811993    7812147    155
YourSeq  143    226    384   3000   95.6%     chr12  -       86921354   86921784   431
YourSeq  143    233    386   3000   96.8%     chr5   +       121217841  121217996  156
YourSeq  143    232    385   3000   96.8%     chr5   +       114836254  114836409  156
YourSeq  142    233    385   3000   96.8%     chr10  -       24632631   24632785   155
YourSeq  142    233    385   3000   96.8%     chr6   +       135209578  135209732  155
YourSeq  142    233    385   3000   96.8%     chr5   +       123419637  123419791  155
YourSeq  142    233    385   3000   96.8%     chr5   +       114735214  114735368  155
YourSeq  142    233    385   3000   96.8%     chr5   +       13542820   13542974   155
YourSeq  141    233    384   3000   96.8%     chr8   -       17863004   17863157   154
YourSeq  141    233    384   3000   96.8%     chr10  -       63411629   63411782   154
YourSeq  141    233    385   3000   96.8%     chr2   +       26639608   26639764   157
YourSeq  141    233    384   3000   96.8%     chr1   +       51504549   51504702   154
YourSeq  141    233    384   3000   96.8%     chr1   +       7745364    7745517    154
YourSeq  140    232    385   3000   96.2%     chrX   -       144253029  144253185  157
YourSeq  140    234    384   3000   96.7%     chr8   -       105215605  105215756  152
YourSeq  140    233    385   3000   96.1%     chr5   -       93064203   93064357   155

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr9   +       99579694   99582693   3000
YourSeq  214    1476   1801  3000   88.4%     chr2   -       165941820  165942144  325
YourSeq  213    1470   1802  3000   90.5%     chrX   -       36679717   36680189   473
YourSeq  200    1476   1807  3000   88.8%     chr12  +       110731113  110731450  338
YourSeq  182    117    1802  3000   91.8%     chr10  -       119576123  119585714  9592
YourSeq  159    101    301   3000   89.0%     chrX   -       142384174  142384369  196
YourSeq  157    116    308   3000   89.6%     chr1   +       156060818  156061009  192
YourSeq  155    7      300   3000   90.2%     chr16  -       11202073   11202371   299
YourSeq  154    1515   1771  3000   87.8%     chr11  +       88206040   88206616   577
YourSeq  153    118    319   3000   89.7%     chr17  -       29196483   29196700   218
YourSeq  152    1486   1798  3000   83.1%     chr10  +       81083877   81084125   249
YourSeq  151    118    300   3000   92.2%     chr11  +       77184549   77184735   187
YourSeq  149    118    301   3000   91.2%     chr8   -       119526326  119526511  186
YourSeq  149    117    301   3000   90.3%     chr17  -       25975287   25975471   185
YourSeq  148    117    301   3000   90.3%     chr8   -       27148488   27148672   185
YourSeq  148    117    301   3000   90.3%     chr2   -       129249146  129249331  186
YourSeq  148    120    324   3000   88.9%     chr12  -       12856690   12856909   220
YourSeq  148    127    335   3000   87.9%     chr17  +       83439219   83439426   208
YourSeq  147    1542   1776  3000   89.3%     chr15  +       99980621   99981230   610
YourSeq  146    116    301   3000   88.3%     chr8   -       124996759  124996941  183

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
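The two full-length (score 3000, 100.0% identity) chr9 hits give the genomic positions of the upstream and downstream query sections. The short check below assumes that those two sections flank the cKO region directly and that the coordinates are 1-based and inclusive, as they appear in the BLAT output; under those assumptions the gap between them should equal the cKO region size.

```python
# Genomic coordinates of the two 100%-identity chr9 hits from the BLAT tables.
upstream_hit = (99575192, 99578191)    # 3000 bp section upstream of exon 3
downstream_hit = (99579694, 99582693)  # 3000 bp section downstream of exon 4


def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


# Assumption: the cKO region is exactly the interval between the two hits.
cko_start = upstream_hit[1] + 1
cko_end = downstream_hit[0] - 1

print("upstream section:", span(*upstream_hit), "bp")
print("downstream section:", span(*downstream_hit), "bp")
print("interval between hits: chr9:%d-%d (%d bp)"
      % (cko_start, cko_end, span(cko_start, cko_end)))
```

The interval between the hits comes to 1502 bp, which agrees with the effective cKO region size of ~1502 bp given in the strategy summary.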


Gene and information: Dbr1, debranching RNA lariats 1 [Mus musculus (house mouse)]. Gene ID: 83703, updated on 12-Aug-2019

Gene summary

Official Symbol: Dbr1 (provided by MGI)
Official Full Name: debranching RNA lariats 1 (provided by MGI)
Primary source: MGI:MGI:1931520
See related: Ensembl:ENSMUSG00000032469
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AW018415
Expression: Ubiquitous expression in CNS E11.5 (RPKM 8.4), liver E14 (RPKM 7.1) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 9; 9 E3.3

Exon count: 8

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (99575786..99585007)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (99476218..99484762)

[Figure: genomic context of Dbr1 on chromosome 9 (NC_000075.6).]


Transcript information: This gene has 6 transcripts

Gene: Dbr1 ENSMUSG00000032469

Description: debranching RNA lariats 1 [Source:MGI Symbol;Acc:MGI:1931520]
Location: Chromosome 9: 99,575,799-99,584,501 forward strand. GRCm38:CM001002.2
About this gene: This gene has 6 transcripts (splice variants), 202 orthologues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein  Translation ID        Biotype                  CCDS       UniProt  Flags
Dbr1-201  ENSMUST00000066650.11  2261  550aa    ENSMUSP00000070991.5  Protein coding           CCDS23435  Q923B1   TSL:1, GENCODE basic, APPRIS P1
Dbr1-205  ENSMUST00000148987.7   796   231aa    ENSMUSP00000115074.1  Protein coding           -          D3YXA2   CDS 3' incomplete, TSL:3
Dbr1-202  ENSMUST00000136884.1   521   151aa    ENSMUSP00000114670.1  Protein coding           -          F6ZYX6   CDS 5' incomplete, TSL:2
Dbr1-203  ENSMUST00000138002.1   491   164aa    ENSMUSP00000119924.1  Protein coding           -          F6S2R9   CDS 5' and 3' incomplete, TSL:3
Dbr1-206  ENSMUST00000156035.7   485   148aa    ENSMUSP00000115978.1  Protein coding           -          F6UCV1   CDS 5' incomplete, TSL:5
Dbr1-204  ENSMUST00000139796.7   842   83aa     ENSMUSP00000115203.1  Nonsense mediated decay  -          F6SDN8   CDS 5' incomplete, TSL:3


[Figure: Ensembl genome browser view of a 28.70 kb region of chromosome 9 (99.57 Mb to 99.59 Mb). Forward strand: Dbr1-201 (protein coding), Dbr1-205 (protein coding), Dbr1-206 (protein coding), Dbr1-203 (protein coding), Dbr1-204 (nonsense mediated decay), Dbr1-202 (protein coding). Contig: AC156497.2. Reverse strand: Armc8-201 (protein coding), Armc8-203 (protein coding), Armc8-202 (protein coding), Armc8-204 (retained intron). Regulatory Build track legend: CTCF, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000066650

[Figure: Ensembl protein summary for Dbr1-201 (protein coding; 8.70 kb, forward strand; translation ENSMUSP00000070...), drawn on a scale of 0 to 550 amino acids. Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily SSF56300; SMART (Lariat debranching, C-terminal); Pfam (Calcineurin-like phosphoesterase domain, ApaH type; Lariat debranching enzyme, C-terminal); PANTHER PTHR12849; CDD (Lariat debranching enzyme, N-terminal metallophosphatase domain); sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, inframe deletion, missense variant, splice region variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
