https://www.alphaknockout.com

Mouse Bdp1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Bdp1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Bdp1 gene (NCBI Reference Sequence: NM_001081061; Ensembl: ENSMUSG00000049658) is located on mouse chromosome 13. 39 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 39 (Transcript: ENSMUST00000038104). Exons 5~6 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Bdp1 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP24-232K9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 5 starts at about 8.92% of the coding region. Knockout of exons 5~6 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion: 3992 bp, and the size of intron 6 for 3'-loxP site insertion: 2746 bp. The size of the effective cKO region: ~1852 bp. The cKO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3', exons 1, 5, 6 and 39 labelled, with gRNA regions flanking exons 5~6), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Bdp1, homology arm, cKO region, loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
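The self-alignment described in the note can be illustrated with a minimal sketch. It assumes a simple exact word-match dot plot with the 10 bp window stated above; the function names and the toy sequence are hypothetical, and the tool actually used to draw the plot is not stated in this report.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def self_dot_plot(seq, window=10):
    """Return (forward, reverse) lists of (i, j) positions whose
    window-sized words match exactly. Forward matches are reported
    once with i < j; the trivial diagonal i == j is skipped."""
    index = defaultdict(list)
    last = len(seq) - window + 1
    for i in range(last):
        index[seq[i:i + window]].append(i)
    forward = sorted((i, j) for pos in index.values()
                     for i in pos for j in pos if i < j)
    reverse = sorted((i, j) for j in range(last)
                     for i in index.get(revcomp(seq[j:j + window]), []))
    return forward, reverse

# Toy example (not project data): a 12 bp unit repeated three times.
toy = "ACGTTGCAAGCT" * 3
toy_forward, toy_reverse = self_dot_plot(toy, window=10)
toy_offsets = {j - i for i, j in toy_forward}
print(toy_offsets)  # offsets between matching words
```

In such a plot a tandem repeat appears as short diagonals parallel to the main diagonal, offset by multiples of the repeat unit length.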

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length(8352bp) | A(28.1% 2347) | C(18.68% 1560) | T(32.24% 2693) | G(20.98% 1752)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
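As a cross-check of the summary line, the base counts can be turned back into percentages and an overall GC fraction. This is a minimal sketch: the counts are copied from the summary above, while `windowed_gc` is a hypothetical helper showing one way a sliding-window profile (such as the 300 bp window mentioned here) could be computed, not necessarily how the plot was generated.

```python
# Base counts as reported in the summary line (full length 8352 bp).
counts = {"A": 2347, "C": 1560, "T": 2693, "G": 1752}

total = sum(counts.values())
percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

def windowed_gc(seq, window=300, step=1):
    """GC fraction of each window of the given size along seq."""
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]

print(total, percent, gc_percent)
print(windowed_gc("GGCCAATT", window=4, step=2))  # toy sequence, not project data
```

The counts sum to the stated full length, and the overall GC content works out to roughly 40%, consistent with the note that no significantly GC-rich region was found.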


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr13  -       100093796  100096795  3000
YourSeq  358    954    1915  3000   91.6%     chr16  -       30713521   30714133   613
YourSeq  350    1320   1915  3000   93.1%     chr2   -       172639689  172640263  575
YourSeq  348    1374   1921  3000   91.9%     chr13  +       59723244   59723649   406
YourSeq  347    1372   1919  3000   91.7%     chr14  -       24417484   24417888   405
YourSeq  347    1374   1915  3000   92.4%     chrX   +       162602716  162603115  400
YourSeq  345    1378   1913  3000   92.8%     chr13  +       52873300   52873692   393
YourSeq  343    1377   2019  3000   90.7%     chr13  +       44574395   44574817   423
YourSeq  342    1377   1915  3000   92.0%     chr4   +       70210711   70211107   397
YourSeq  341    1377   1916  3000   91.7%     chr8   -       84286569   84286964   396
YourSeq  341    1378   1913  3000   92.2%     chr17  -       87756389   87756782   394
YourSeq  341    1378   1915  3000   92.3%     chr1   -       177588021  177588416  396
YourSeq  341    1381   1963  3000   91.1%     chr4   +       72550056   72550483   428
YourSeq  341    1373   1919  3000   91.1%     chr19  +       27439357   27439762   406
YourSeq  340    1377   1916  3000   91.7%     chr6   -       39814714   39815111   398
YourSeq  340    1378   1916  3000   91.7%     chr1   -       178380366  178380761  396
YourSeq  340    1377   1922  3000   91.5%     chr2   +       175648195  175648599  405
YourSeq  340    1377   1917  3000   91.5%     chr15  +       6243255    6243650    396
YourSeq  340    1378   1915  3000   91.9%     chr14  +       72891471   72891865   395
YourSeq  339    1378   1915  3000   91.9%     chr14  +       19664633   19665027   395

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr13  -       100088944  100091943  3000
YourSeq  187    1127   1335  3000   95.3%     chr1   +       151657724  151658248  525
YourSeq  176    1107   1331  3000   89.9%     chr4   -       109223038  109223257  220
YourSeq  175    1125   1328  3000   94.9%     chr11  +       73182666   73182881   216
YourSeq  170    1125   1333  3000   89.8%     chr9   +       56742470   56742671   202
YourSeq  169    1126   1325  3000   93.8%     chr11  -       102451731  102451931  201
YourSeq  166    1112   1324  3000   87.8%     chr2   +       132608661  132608863  203
YourSeq  165    1125   1326  3000   94.2%     chr8   +       106715720  106715937  218
YourSeq  164    1111   1317  3000   91.0%     chr12  -       80281906   80282108   203
YourSeq  164    1126   1326  3000   92.1%     chr1   -       133022356  133022555  200
YourSeq  164    1125   1330  3000   88.6%     chr5   +       65689732   65689929   198
YourSeq  164    1125   1329  3000   90.1%     chr4   +       55324667   55324865   199
YourSeq  163    1125   1333  3000   88.4%     chr9   -       99443440   99443643   204
YourSeq  163    1125   1324  3000   93.6%     chr2   -       168671472  168671671  200
YourSeq  163    1125   1333  3000   88.3%     chr12  +       76258087   76258280   194
YourSeq  162    1124   1316  3000   94.1%     chr5   -       147296548  147296943  396
YourSeq  162    1059   1316  3000   91.3%     chr11  +       60571536   60572126   591
YourSeq  162    795    1322  3000   82.1%     chr1   +       8643334    8643579    246
YourSeq  161    1125   1318  3000   90.7%     chr19  -       5821166    5821352    187
YourSeq  161    1126   1317  3000   92.2%     chrX   +       161019245  161019438  194

Note: The 3000 bp section downstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
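The self-hits in the two BLAT tables can be cross-checked against the stated size of the effective cKO region. The sketch below assumes the listed chr13 coordinates are 1-based and inclusive (each self-hit then spans exactly 3000 bp) and that the two 3000 bp flanks directly abut the cKO region; the variable and function names are illustrative only.

```python
# Self-hits on chr13 (reverse strand), copied from the two BLAT tables.
UP_START, UP_END = 100093796, 100096795      # 3000 bp upstream flank
DOWN_START, DOWN_END = 100088944, 100091943  # 3000 bp downstream flank

def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1

def gap_between(lower_end, upper_start):
    """Number of bases strictly between two non-overlapping intervals."""
    return upper_start - lower_end - 1

# Bdp1 is on the reverse strand, so the upstream flank has the higher
# genomic coordinates and the downstream flank the lower ones.
cko_size = gap_between(DOWN_END, UP_START)

print(span(UP_START, UP_END), span(DOWN_START, DOWN_END), cko_size)
```

Under these assumptions the interval between the flanks is 1852 bp, which agrees with the effective cKO region size of ~1852 bp given in the strategy note.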


Gene and information: Bdp1 B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB [ Mus musculus (house mouse) ] Gene ID: 544971, updated on 10-Oct-2019

Gene summary

Official Symbol: Bdp1 (provided by MGI)
Official Full Name: B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB (provided by MGI)
Primary source: MGI:MGI:1347077
See related: Ensembl:ENSMUSG00000049658
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: TFC5; Tfnr; TAF3B1; AI662220; AW049390; AW456745; TFIIIB90; TFIIIB150; mKIAA1241; B130055N23Rik; G630013P12Rik
Expression: Ubiquitous expression in CNS E11.5 (RPKM 3.1), liver E14 (RPKM 2.4) and 26 other tissues
Orthologs: human, all

Genomic context

Location: 13 D1; 13 52.92 cM

Exon count: 38

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  13   NC_000079.6 (100017994..100104090, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    13   NC_000079.5 (100787949..100874025, complement)

[Figure: genomic context of Bdp1 on Chromosome 13 - NC_000079.6.]


Transcript information: This gene has 7 transcripts

Gene: Bdp1 ENSMUSG00000049658

Description: B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB [Source:MGI Symbol;Acc:MGI:1347077]
Gene Synonyms: B130055N23Rik, G630013P12Rik, TAF3B1, TFC5, TFIIIB150, TFIIIB90, Tfnr
Location: Chromosome 13: 100,017,994-100,104,070 reverse strand. GRCm38:CM001006.2
About this gene: This gene has 7 transcripts (splice variants), 192 orthologues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name      Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt  Flags
Bdp1-201  ENSMUST00000038104.11  9921  2467aa      ENSMUSP00000038321.5  Protein coding           CCDS49343  Q571C7   TSL:5, GENCODE basic, APPRIS P2
Bdp1-204  ENSMUST00000109379.8   7567  2471aa      ENSMUSP00000105005.2  Protein coding           -          Q571C7   TSL:5, GENCODE basic, APPRIS ALT2
Bdp1-202  ENSMUST00000099262.9   5352  325aa       ENSMUSP00000096868.3  Nonsense mediated decay  -          F8VQC2   CDS 5' incomplete, TSL:1
Bdp1-205  ENSMUST00000163097.1   631   128aa       ENSMUSP00000126435.1  Nonsense mediated decay  -          E9PYY8   TSL:3
Bdp1-207  ENSMUST00000167815.1   809   No protein  -                     Retained intron          -          -        TSL:2
Bdp1-203  ENSMUST00000105697.2   636   No protein  -                     Retained intron          -          -        TSL:2
Bdp1-206  ENSMUST00000167203.1   628   No protein  -                     lncRNA                   -          -        TSL:5


[Figure: Ensembl genome browser view of a 106.08 kb region of chromosome 13 (100.02 Mb to 100.10 Mb). Forward strand: Serf1-201 to Serf1-205 (protein coding). Contigs: CT025569.14, CT030167.8. Reverse strand: Bdp1-201 and Bdp1-204 (protein coding), Bdp1-202 and Bdp1-205 (nonsense mediated decay), Bdp1-203 and Bdp1-207 (retained intron), Bdp1-206 (lncRNA); Mccc2-201 (protein coding), Mccc2-203 (nonsense mediated decay), Mccc2-204 (lncRNA); BC001981-201 and BC001981-202 (lncRNA). Regulatory Build legend: CTCF, Enhancer, Open, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding, merged Ensembl/Havana); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000038104

[Figure: protein feature view of Bdp1-201 (protein coding; reverse strand, 86.08 kb; scale bar 0 to 2467 aa). Tracks: MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Superfamily (Homeobox-like domain superfamily), SMART (SANT/Myb domain), Pfam (Transcription factor TFIIIB component B'', Myb domain), PANTHER (PTHR22929), and sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, inframe deletion, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
