https://www.alphaknockout.com

Mouse Mybpc1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Mybpc1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:

The Mybpc1 gene (NCBI Reference Sequence: NM_001252372; Ensembl: ENSMUSG00000020061) is located on mouse chromosome 10. 29 exons are identified, with the ATG start codon in exon 1 and the TAG stop codon in exon 28 (Transcript: ENSMUST00000119185). Exons 3~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Mybpc1 gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-408J20 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 3 starts at about 1.83% of the coding region. Knockout of exons 3~4 will result in a frameshift of the gene. The size of intron 2, for 5'-loxP site insertion, is 12130 bp, and the size of intron 4, for 3'-loxP site insertion, is 863 bp. The size of the effective cKO region is ~2510 bp. The cKO region does not contain any other known gene.
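
The frameshift statement in the note rests on simple arithmetic: removing coding exons shifts the reading frame when the total deleted coding length is not a multiple of 3. The following is a minimal sketch of that check only. The function name and the exon lengths are hypothetical placeholders, since this report does not list the exon 3 and exon 4 coding lengths.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting exons with these coding lengths (bp) shifts the reading frame."""
    return sum(deleted_coding_lengths) % 3 != 0


# HYPOTHETICAL coding lengths (bp) for the two deleted exons; substitute the
# real exon 3 and exon 4 values from the transcript annotation.
hypothetical_lengths = [160, 105]
print(causes_frameshift(hypothetical_lengths))
```

With real exon lengths from the annotation of ENSMUST00000119185, the same check could be used to confirm the report's statement.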


Overview of the Targeting Strategy

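[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (5' to 3', exons 1, 3, 4, 5 and 29 labelled, with the two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Mybpc1; homology arm; cKO region; loxP site.]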


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region was aligned with itself to check for tandem repeats. No significant tandem repeat was found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
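
A self-comparison dot plot of this kind can be sketched as follows: every pair of positions whose 10 bp words are identical becomes a dot, and off-diagonal runs of dots indicate repeats. This is an illustration only, not the tool used to produce the report's plot. It covers forward matches only, and the function name and toy sequence are hypothetical.

```python
def dot_plot_matches(seq, window=10):
    """Return sorted (i, j) pairs with i < j whose window-length words are identical."""
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        # Every pair of occurrences of the same word is one off-diagonal dot.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# HYPOTHETICAL toy sequence: a 12 bp unit repeated twice in tandem.
toy = "ACGTACGTTGCA" * 2
print(dot_plot_matches(toy, window=10))
```

A sequence with no repeated 10 bp word returns an empty list, which corresponds to a dot plot showing only the main diagonal.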

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full Length (9010 bp) | A (29.56%, 2663) | C (21.09%, 1900) | T (27.24%, 2454) | G (22.12%, 1993)

Note: The sequence of the homologous arms and cKO region was analyzed to determine the GC content. No significant high GC-content region was found, so this region is suitable for PCR screening or sequencing analysis.
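
The overall GC content follows directly from the base counts in the summary line, and the windowed distribution can be computed with a sliding window. The sketch below uses the counts from the report; the `windowed_gc` helper and its `step` parameter are illustrative assumptions, not the tool that produced the plot.

```python
# Base counts taken from the report's summary line.
counts = {"A": 2663, "C": 1900, "T": 2454, "G": 1993}
total = sum(counts.values())
gc_fraction = (counts["G"] + counts["C"]) / total
print(f"length = {total} bp, GC = {gc_fraction:.2%}")


def windowed_gc(seq, window=300, step=1):
    """Return the GC fraction of each full window of the given size along seq."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]
```

Applied to the 9010 bp sequence with `window=300`, the second function would give the values underlying a plot like the one described above.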


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       88573780   88576779   3000
YourSeq  124    1212   1339  3000   99.3%     chr18  -       66086989   66087162   174
YourSeq  124    1211   1339  3000   98.5%     chr12  -       77797398   77797588   191
YourSeq  124    1212   1339  3000   99.3%     chr1   +       73892446   73892591   146
YourSeq  124    1211   1339  3000   98.5%     chr1   +       73892427   73892583   157
YourSeq  123    1212   1339  3000   98.5%     chr16  -       83187996   83188154   159
YourSeq  116    1212   1337  3000   96.8%     chr15  -       92056167   92056324   158
YourSeq  109    1214   1339  3000   99.2%     chr5   +       44703225   44703392   168
YourSeq  107    1212   1339  3000   89.2%     chr12  +       7088399    7088509    111
YourSeq  106    1226   1339  3000   99.1%     chr12  -       33456838   33456965   128
YourSeq  105    1230   1339  3000   99.1%     chr14  -       72498798   72498959   162
YourSeq  104    1211   1338  3000   98.2%     chr12  +       60334230   60334488   259
YourSeq  102    1140   1339  3000   88.6%     chr15  -       40108207   40108373   167
YourSeq  100    1237   1339  3000   100.0%    chr16  -       46533747   46534267   521
YourSeq  100    1151   1339  3000   91.0%     chr9   +       71996711   71997120   410
YourSeq  99     1212   1339  3000   90.5%     chr11  +       18711249   18711368   120
YourSeq  98     1230   1339  3000   91.1%     chr16  -       51163761   51163861   101
YourSeq  98     1127   1337  3000   98.1%     chrX   +       93530906   93531161   256
YourSeq  98     1230   1339  3000   96.3%     chr1   +       191214749  191214899  151
YourSeq  97     1230   1339  3000   99.1%     chr19  -       30251177   30251450   274

Note: The 3000 bp section upstream of Exon 3 was searched against the genome with BLAT. Apart from the expected self-match on chr10, no significant similarity was found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       88568270   88571269   3000
YourSeq  161    1752   2114  3000   96.1%     chr1   -       172696828  172786518  89691
YourSeq  105    1756   1893  3000   88.7%     chr19  -       46117702   46117827   126
YourSeq  104    1734   1914  3000   94.1%     chr11  +       72109629   72109825   197
YourSeq  104    1782   1921  3000   87.7%     chr1   +       25756397   25756518   122
YourSeq  101    1809   1936  3000   96.4%     chr18  +       79927634   79927989   356
YourSeq  94     1782   2029  3000   81.0%     chr1   +       52079328   52079435   108
YourSeq  89     1782   2029  3000   83.6%     chr1   +       97663814   97664005   192
YourSeq  83     2011   2144  3000   81.5%     chr18  -       42663701   42663808   108
YourSeq  83     1759   2025  3000   78.3%     chr1   +       179327990  179328173  184
YourSeq  81     1957   2118  3000   91.7%     chr7   +       41485571   41485777   207
YourSeq  81     1674   2109  3000   90.0%     chr1   +       180285303  180285869  567
YourSeq  80     1996   2104  3000   90.9%     chr10  -       5006263    5006380    118
YourSeq  77     2016   2120  3000   92.4%     chr1   +       79523025   79523155   131
YourSeq  75     1934   2119  3000   80.0%     chr9   +       112646159  112646308  150
YourSeq  75     2021   2119  3000   92.1%     chr7   +       89920662   89920763   102
YourSeq  74     2016   2120  3000   94.1%     chrX   -       148437353  148437480  128
YourSeq  74     2032   2120  3000   95.6%     chr1   +       187382785  187382902  118
YourSeq  73     2017   2119  3000   87.5%     chr1   +       27061059   27061173   115
YourSeq  71     2030   2120  3000   85.6%     chrX   +       99661519   99661605   87

Note: The 3000 bp section downstream of Exon 4 was searched against the genome with BLAT. Apart from the expected self-match on chr10, no significant similarity was found.
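
The self-matches in the two BLAT tables give genomic coordinates for the flanking sections, which allows a quick consistency check on the stated cKO region size. The sketch below assumes that the two 3000 bp query sections directly abut the cKO region; that is an inference from the coordinates, not something the report states. Because the gene lies on the reverse strand, the upstream section has the higher chromosome coordinates.

```python
# chr10 self-match coordinates copied from the two BLAT tables (1-based, inclusive).
up_start, up_end = 88573780, 88576779      # 3000 bp section upstream of Exon 3
down_start, down_end = 88568270, 88571269  # 3000 bp section downstream of Exon 4


def span(start, end):
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


# ASSUMPTION: the cKO region is exactly the gap between the two sections.
cko_start = down_end + 1
cko_end = up_start - 1
print(span(up_start, up_end), span(down_start, down_end), span(cko_start, cko_end))
```

Under that assumption the gap between the two sections is 2510 bp, which agrees with the effective cKO region size of ~2510 bp given in the strategy summary.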


Gene and information:

Mybpc1 myosin binding protein C, slow-type [Mus musculus (house mouse)]
Gene ID: 109272, updated on 10-Oct-2019

Gene summary

Official Symbol: Mybpc1 (provided by MGI)
Official Full Name: myosin binding protein C, slow-type (provided by MGI)
Primary source: MGI:MGI:1336213
See related: Ensembl:ENSMUSG00000020061
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 8030451F13Rik
Expression: Biased expression in limb E14.5 (RPKM 9.6), mammary gland adult (RPKM 6.9) and 11 other tissues
Orthologs: human

Genomic context

Location: 10; 10 C1

Exon count: 35

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (88518279..88605229, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (87981027..88067897, complement)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 7 transcripts

Gene: Mybpc1 ENSMUSG00000020061

Description: myosin binding protein C, slow-type [Source:MGI Symbol;Acc:MGI:1336213]
Gene Synonyms: 8030451F13Rik, Slow-type C-protein
Location: Chromosome 10: 88,518,279-88,605,152 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 7 transcripts (splice variants), 201 orthologues, 12 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Mybpc1-202  ENSMUST00000121629.7  3922  1124aa      ENSMUSP00000112615.1  Protein coding   CCDS48666  Q6P6L5   TSL:1, GENCODE basic, APPRIS P3
Mybpc1-201  ENSMUST00000119185.7  3752  1127aa      ENSMUSP00000112699.1  Protein coding   CCDS56739  D3YU50   TSL:1, GENCODE basic, APPRIS ALT2
Mybpc1-207  ENSMUST00000238199.1  3650  1139aa      ENSMUSP00000158844.1  Protein coding   -          -        GENCODE basic, APPRIS ALT2
Mybpc1-206  ENSMUST00000156573.1  1876  626aa       ENSMUSP00000119024.1  Protein coding   -          F6RQD1   CDS 5' and 3' incomplete, TSL:5
Mybpc1-205  ENSMUST00000153964.7  1392  362aa       ENSMUSP00000122472.1  Protein coding   -          F7D574   CDS 5' incomplete, TSL:1
Mybpc1-204  ENSMUST00000148205.1  805   No protein  -                     Retained intron  -          -        TSL:2
Mybpc1-203  ENSMUST00000124144.1  600   No protein  -                     Retained intron  -          -        TSL:1


[Figure: Ensembl genome browser view of a 106.87 kb region of chromosome 10 (88.52 Mb to 88.60 Mb). Forward strand: Gm48752-201 (TEC). Contigs: AC165357.4, AC164567.5. Reverse strand, genes (comprehensive set): Mybpc1-202, Mybpc1-207, Mybpc1-201, Mybpc1-205 and Mybpc1-206 (protein coding); Mybpc1-203 and Mybpc1-204 (retained intron). Regulatory Build track. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript).]


Transcript: ENSMUST00000119185

[Figure: Ensembl protein domain view of Mybpc1-201 (protein coding; reverse strand, 86.69 kb). Annotation tracks: MobiDB lite; low complexity (Seg); Superfamily (Fibronectin type III superfamily; Immunoglobulin-like domain superfamily); SMART (Fibronectin type III; Immunoglobulin subtype 2; Immunoglobulin subtype); Prints (PR00014); Pfam (Immunoglobulin I-set; MyBP-C, tri-helix bundle domain; Fibronectin type III); PROSITE profiles (Immunoglobulin-like domain; Fibronectin type III); PANTHER (PTHR13817:SF27; PTHR13817); Gene3D (Immunoglobulin-like fold); CDD (cd05894; Fibronectin type III; cd00096). Sequence variants (dbSNP and all other sources). Variant legend: missense variant; synonymous variant. Scale bar: 0 to 1127.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
