http://www.alphaknockout.com/ Mouse Zyg11b Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Zyg11b conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Zyg11b gene (NCBI Reference Sequence: NM_001033634; Ensembl: ENSMUSG00000034636) is located on mouse chromosome 4. Fourteen exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 14 (Transcript Zyg11b-201: ENSMUST00000043616). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Zyg11b gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-272K18 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:

Exon 3 starts at about 8.83% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. The size of intron 2, for 5'-loxP site insertion, is 5645 bp, and the size of intron 3, for 3'-loxP site insertion, is 5727 bp.
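The position and frameshift statements above can be sanity-checked with simple arithmetic. The sketch below assumes the coding sequence is (744 + 1) x 3 nt long, derived from the 744 aa protein listed for Zyg11b-201 in this report plus one stop codon. The lengths passed to `causes_frameshift` are hypothetical illustration values, because this report does not state the length of exon 3.

```python
PROTEIN_LENGTH_AA = 744        # Zyg11b-201 protein length given in this report
EXON3_START_FRACTION = 0.0883  # "about 8.83% of the coding region"


def cds_length_nt(protein_aa: int) -> int:
    """Coding sequence length in nucleotides, counting one stop codon (assumption)."""
    return (protein_aa + 1) * 3


def approx_cds_offset(fraction: float, cds_nt: int) -> int:
    """Approximate nucleotide offset within the CDS for a fractional position."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(PROTEIN_LENGTH_AA)
exon3_offset = approx_cds_offset(EXON3_START_FRACTION, cds_nt)
print(f"CDS length: {cds_nt} nt; exon 3 starts near CDS nucleotide {exon3_offset}")

# Hypothetical exon lengths, for illustration only.
for length in (99, 100):
    print(length, "nt deleted -> frameshift:", causes_frameshift(length))
```

An exon whose coding length is a multiple of 3 would be removed in frame, which is why the frameshift property is worth confirming against the actual exon 3 sequence.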

The size of the effective cKO region is ~1570 bp. This strategy is designed based on genetic information in existing databases. Due to the complexity of biological processes, the risks that loxP insertion poses to gene transcription, RNA splicing and translation cannot all be predicted with current technology.

Overview of the Targeting Strategy

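[Figure: schematic of four alleles. (1) Wildtype allele, with exons 1, 3 and 14 labeled and the 5' and 3' gRNA regions flanking exon 3. (2) Targeting vector. (3) Targeted allele. (4) Constitutive KO allele (after Cre recombination). Legend: exon of mouse Zyg11b; homology arm; cKO region; loxP site.]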
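The relationship between the targeted allele and the constitutive KO allele can be illustrated with a toy string model. The sketch below assumes two loxP sites in the same orientation, uses the canonical 34 bp loxP sequence, and uses short made-up flanking and "exon" sequences that are placeholders, not Zyg11b sequence. It is a minimal illustration of excision, not a model of the actual vector.

```python
# Canonical loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Remove the segment between two directly repeated sites, leaving one site behind.

    Returns the allele unchanged if fewer than two sites are present.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Placeholder sequences for illustration only.
upstream, floxed_region, downstream = "GGGG", "CCCCCC", "TTTT"
targeted_allele = upstream + LOXP + floxed_region + LOXP + downstream
ko_allele = cre_excise(targeted_allele)
print(len(targeted_allele), "->", len(ko_allele))
```

After excision a single loxP site remains between the flanking sequences, which matches the constitutive KO allele drawn in the schematic.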

Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement. Axis label as extracted: Sequence 12.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
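A self-alignment dot plot of this kind can be approximated by listing exact window matches between a sequence and itself. The sketch below is a simplified stand-in that assumes exact 10 bp matches; the tool used for this report may score windows differently. The function names and the toy sequences are illustrative, not taken from the report.

```python
from typing import Dict, List, Tuple


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot_matches(a: str, b: str, window: int = 10) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where a[i:i+window] exactly equals b[j:j+window]."""
    index: Dict[str, List[int]] = {}
    for j in range(len(b) - window + 1):
        index.setdefault(b[j:j + window], []).append(j)
    return [
        (i, j)
        for i in range(len(a) - window + 1)
        for j in index.get(a[i:i + window], [])
    ]


def off_diagonal_repeats(seq: str, window: int = 10) -> List[Tuple[int, int]]:
    """Self-matches away from the main diagonal, which indicate repeated windows."""
    return [(i, j) for i, j in dot_plot_matches(seq, seq, window) if i != j]


# Toy sequences for illustration only.
tandem = "ACGTTGCAAT" * 2   # one 10 bp unit repeated twice
unique = "ACGTTGCAATGG"     # no repeated 10 bp window
print(off_diagonal_repeats(tandem))
print(off_diagonal_repeats(unique))
```

Reverse-complement matches can be listed the same way by calling `dot_plot_matches(seq, reverse_complement(seq))`.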

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label as extracted: Sequence 12.]

Summary: Full Length(7755bp) | A(25.2% 1954) | C(20.77% 1611) | G(22.04% 1709) | T(31.99% 2481)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
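The base composition in the summary line can be reproduced from the reported counts, and the overall GC content follows directly. The sketch below uses only the four counts given in this report; the `windowed_gc` helper is a hypothetical illustration of a windowed scan using non-overlapping windows, which may differ from how the plotted distribution was computed.

```python
from typing import Dict, List

# Base counts as reported in the summary line (full length 7755 bp).
BASE_COUNTS: Dict[str, int] = {"A": 1954, "C": 1611, "G": 1709, "T": 2481}


def composition_percent(counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100.0 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts: Dict[str, int]) -> float:
    """Overall GC content as a percentage, rounded to two decimals."""
    total = sum(counts.values())
    return round(100.0 * (counts["G"] + counts["C"]) / total, 2)


def windowed_gc(seq: str, window: int = 300, step: int = 300) -> List[float]:
    """GC percentage per window (non-overlapping by default; an assumption)."""
    seq = seq.upper()
    return [
        100.0 * (seq[s:s + window].count("G") + seq[s:s + window].count("C")) / window
        for s in range(0, len(seq) - window + 1, step)
    ]


print(sum(BASE_COUNTS.values()), composition_percent(BASE_COUNTS), gc_percent(BASE_COUNTS))
```

The counts sum to the stated full length, and the overall GC content works out to roughly 42.8%.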

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4   -       108266823  108269822  3000
YourSeq  269    1      1783  3000   90.7%     chr10  +       60020924   60262168   241245
YourSeq  256    1      1783  3000   90.5%     chr12  +       84860247   85006115   145869
YourSeq  129    1      153   3000   93.4%     chr18  -       84923595   84923751   157
YourSeq  129    1607   1783  3000   86.0%     chr17  -       47503598   47503748   151
YourSeq  128    1      147   3000   94.6%     chrX   +       103339017  103339290  274
YourSeq  127    1624   1783  3000   88.6%     chr5   -       132788768  132788920  153
YourSeq  127    1631   1783  3000   92.1%     chr5   -       93163370   93163528   159
YourSeq  126    1637   1783  3000   93.8%     chr14  -       70756208   70756373   166
YourSeq  126    1639   1783  3000   90.8%     chr11  -       103949882  103950022  141
YourSeq  126    1641   1785  3000   92.3%     chr15  +       91166450   91166592   143
YourSeq  124    1646   1785  3000   94.3%     chr11  +       68478232   68478371   140
YourSeq  123    1664   1858  3000   93.1%     chr16  -       94337756   94338093   338
YourSeq  123    1638   1783  3000   92.5%     chr5   +       52856225   52856375   151
YourSeq  123    1641   1783  3000   92.1%     chr11  +       78570924   78571063   140
YourSeq  122    1644   1783  3000   91.3%     chr9   +       70550191   70550327   137
YourSeq  122    1640   1783  3000   93.1%     chr2   +       129811722  129811877  156
YourSeq  122    1642   1783  3000   92.8%     chr18  +       75406202   75406342   141
YourSeq  122    1640   1783  3000   92.4%     chr12  +       28917449   28917592   144
YourSeq  121    1639   1783  3000   91.8%     chr17  -       29498705   29498849   145

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM                 STRAND  START      END        SPAN
-------------------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr4                  -       108262568  108265567  3000
YourSeq  218    1984   2337  3000   87.4%     chr14                 -       91648319   91648577   259
YourSeq  216    1984   2326  3000   86.4%     chr3                  -       115846682  115846939  258
YourSeq  205    1997   2334  3000   92.2%     chr11                 -       10732938   10733276   339
YourSeq  199    1984   2327  3000   93.9%     chr5_JH584299_random  +       313173     313761     589
YourSeq  182    1984   2322  3000   85.0%     chr4                  +       22763988   22764275   288
YourSeq  179    1984   2322  3000   83.6%     chr14                 +       27132757   27133007   251
YourSeq  176    1984   2324  3000   86.5%     chr1                  -       125378428  125378627  200
YourSeq  174    2020   2326  3000   91.8%     chr5                  +       94379672   94380290   619
YourSeq  171    2000   2334  3000   85.8%     chr15                 +       93426720   93426950   231
YourSeq  166    1988   2323  3000   84.0%     chr17                 +       32871386   32871587   202
YourSeq  148    1994   2175  3000   89.0%     chr3                  +       126084596  126084758  163
YourSeq  137    1993   2259  3000   91.5%     chr3                  -       44985679   44986056   378
YourSeq  134    2036   2322  3000   84.0%     chr1                  -       130617595  130617828  234
YourSeq  130    2004   2150  3000   91.7%     chr16                 -       41441863   41442006   144
YourSeq  120    1995   2144  3000   84.9%     chr14                 +       27132757   27132896   140
YourSeq  119    2281   2412  3000   95.5%     chr4                  -       108270502  108270636  135
YourSeq  115    1995   2264  3000   95.3%     chr19                 -       5201036    5201392    357
YourSeq  115    953    1127  3000   90.9%     chr16                 +       18933134   18933329   196
YourSeq  115    2040   2334  3000   78.4%     chr1                  +       132396175  132396338  164

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
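The BLAT tables can be screened programmatically by comparing each secondary hit with the full-length self hit. The sketch below parses three rows copied from the upstream table and reports the best secondary score as a fraction of the self-hit score. The `BlatHit` type and the use of a score ratio are illustrative assumptions; this report does not state what threshold defines "significant similarity".

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated row: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    f = line.split()
    return BlatHit(
        int(f[1]), int(f[2]), int(f[3]), int(f[4]), float(f[5].rstrip("%")),
        f[6], f[7], int(f[8]), int(f[9]), int(f[10]),
    )


# First three rows of the upstream BLAT table in this report.
ROWS: List[str] = [
    "YourSeq 3000 1 3000 3000 100.0% chr4 - 108266823 108269822 3000",
    "YourSeq 269 1 1783 3000 90.7% chr10 + 60020924 60262168 241245",
    "YourSeq 256 1 1783 3000 90.5% chr12 + 84860247 85006115 145869",
]

hits = [parse_hit(row) for row in ROWS]
self_hit, secondary = hits[0], hits[1:]
best_secondary_ratio = max(h.score for h in secondary) / self_hit.score
print(f"best secondary score is {best_secondary_ratio:.1%} of the self hit")
```

In both tables the strongest secondary hit scores well under a tenth of the self hit, which is consistent with the notes that no significant similarity is found.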

Gene and protein information:
Zyg11b zyg-11 family member B, cell cycle regulator [ Mus musculus (house mouse) ]
Gene ID: 414872, updated on 17-Dec-2020

Gene summary

Official Symbol: Zyg11b (provided by MGI)
Official Full Name: zyg-11 family member B, cell cycle regulator (provided by MGI)
Primary source: MGI:MGI:2685277
See related: Ensembl:ENSMUSG00000034636
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm431; D4Mgi2; D4Mgi23; mKIAA1730; 1110046I03Rik; 2810482G21Rik
Expression: Ubiquitous expression in CNS E18 (RPKM 15.5), whole brain E14.5 (RPKM 12.1) and 26 other tissues
Orthologs: human, all

Genomic context

Location: 4; 4 C7

Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
109                 current            GRCm39 (GCF_000001635.27)     4    NC_000070.7 (108084952..108158330, complement)
108.20200622        previous assembly  GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (108227755..108301125, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (107900360..107973695, complement)
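The coordinate ranges above are inclusive, so the gene span on each assembly is end minus start plus one. The sketch below computes the span for the three NCBI annotation rows and for the Ensembl GRCm39 coordinates quoted in the transcript information section of this report (108,086,921 to 108,158,293); the dictionary layout is an illustrative choice.

```python
from typing import Dict, Tuple

# Inclusive (start, end) coordinates on chromosome 4, as listed in this report.
LOCATIONS: Dict[str, Tuple[int, int]] = {
    "NCBI GRCm39": (108084952, 108158330),
    "NCBI GRCm38.p6": (108227755, 108301125),
    "NCBI MGSCv37": (107900360, 107973695),
    "Ensembl GRCm39": (108086921, 108158293),
}


def span_bp(start: int, end: int) -> int:
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


spans = {name: span_bp(*coords) for name, coords in LOCATIONS.items()}
for name, bp in spans.items():
    print(f"{name}: {bp} bp ({bp / 1000:.2f} kb)")
```

The Ensembl span of 71,373 bp agrees with the 71.37 kb reverse-strand scale shown for the Zyg11b-201 transcript view.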

[Figure: genomic context of Zyg11b on chromosome 4 (NC_000070.7).]

Transcript information: This gene has 2 transcripts

Gene: Zyg11b ENSMUSG00000034636

Description: zyg-11 family member B, cell cycle regulator [Source:MGI Symbol;Acc:MGI:2685277]
Gene Synonyms: 1110046I03Rik, 2810482G21Rik, D4Mgi23, LOC242610
Location: Chromosome 4: 108,086,921-108,158,293 reverse strand. GRCm39:CM000997.3
About this gene: This gene has 2 transcripts (splice variants), 273 orthologues and 2 paralogues.

Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype               CCDS       UniProt Match  Flags
Zyg11b-201  ENSMUST00000043616.7  8691  744aa       ENSMUSP00000043844.7  Protein coding        CCDS18448  Q3UFS0-1       TSL:1, GENCODE basic, APPRIS P1
Zyg11b-202  ENSMUST00000130508.2  447   No protein  -                     Processed transcript  -          -              TSL:3

[Figure: Ensembl genome browser view of a 91.37 kb region of chromosome 4 (108.08 Mb to 108.16 Mb), showing forward and reverse strands. Tracks: contigs BX293563.9 and AL627238.15; genes (Comprehensive set from GENCODE) Zyg11b-201 (protein coding), Zyg11b-202 (processed transcript) and Gm12742-201 (processed pseudogene), all on the reverse strand; Regulatory Build. Regulation legend: CTCF; Enhancer; Open Chromatin; Promoter; Promoter Flank. Gene legend: Protein Coding (merged Ensembl/Havana); Non-Protein Coding (pseudogene; processed transcript).]

Transcript: ENSMUST00000043616

[Figure: Ensembl protein summary for Zyg11b-201 (protein coding; reverse strand, 71.37 kb), translation ENSMUSP000000438..., with a scale bar from 0 to 744 amino acids. Annotation tracks as extracted:
Low complexity (Seg)
Superfamily: SSF52047; Armadillo-type fold
PROSITE profiles: Leucine-rich repeat
PANTHER: Protein zyg-11 homologue B; PTHR12904
Gene3D: Leucine-rich repeat domain superfamily; Armadillo-like helical
Sequence variants (dbSNP and all other sources); variant legend: missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC, VectorBuilder.
