https://www.alphaknockout.com

Mouse Irf2bpl Knockout Project (CRISPR/Cas9)

Objective: To create an Irf2bpl knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Irf2bpl gene (NCBI Reference Sequence: NM_145836; Ensembl: ENSMUSG00000034168) is located on mouse chromosome 12. One exon is identified, with both the ATG start codon and the TGA stop codon in exon 1 (Transcript: ENSMUST00000038422). Exon 1 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 1 starts from about 0.04% of the coding region and covers 100.0% of it. The size of the effective KO region is ~2323 bp. The KO region does not contain any other known gene.
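As a rough consistency check on the figures in the note, the sketch below derives a coding-sequence length from the 775 aa protein length listed for ENSMUST00000038422 in the transcript information of this report. It assumes the coding region is counted as 775 codons plus one stop codon, and that "0.04%" refers to the first coding base. Both are assumptions made for illustration, not statements from the report.

```python
# Values taken from the report.
PROTEIN_AA = 775        # protein length of Irf2bpl-201
KO_REGION_BP = 2323     # reported size of the effective KO region


def cds_length_bp(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in bp (assumption: 3 bp per codon, optional stop codon)."""
    codons = protein_aa + (1 if include_stop else 0)
    return codons * 3


cds_bp = cds_length_bp(PROTEIN_AA)

# Position of the first coding base as a percentage of the coding region.
first_base_pct = 100 * 1 / cds_bp

# Size comparison only; this does not show where the KO region lies.
ko_to_cds_ratio = KO_REGION_BP / cds_bp

print(f"CDS length: {cds_bp} bp")
print(f"First coding base: {first_base_pct:.2f}% of the coding region")
print(f"KO region size / CDS length: {ko_to_cds_ratio:.3f}")
```

Under these assumptions the first coding base sits at about 0.04% of the coding region, which agrees with the figure in the note.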

Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exon 1 of mouse Irf2bpl flanked by two gRNA regions. Legend: exon of mouse Irf2bpl; knockout region.]

Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot (labelled "Sequence 12"). Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of start codon is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
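The self-alignment described in the note can be illustrated with a minimal sketch. It is not the tool used to produce the report's dot plots; it assumes exact matching of fixed-size windows (15 bp by default, as stated above), and the function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 15):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: windows at i < j that are identical (off-diagonal dots only).
    reverse: window at i equals the reverse complement of the window at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward = sorted(
        (i, j) for positions in index.values()
        for i in positions for j in positions if i < j
    )
    reverse = sorted(
        (i, j) for i in range(n_windows)
        for j in index.get(reverse_complement(seq[i:i + window]), [])
    )
    return forward, reverse


# Toy example with a small window so the repeat is visible by eye.
fwd, rev = self_dot_plot("GATTACAGATTACA", window=5)
print("forward dots:", fwd)
print("reverse-complement dots:", rev)
```

A run of forward dots parallel to the main diagonal, as in the toy sequence, is the pattern a tandem repeat produces. An empty off-diagonal corresponds to the "no significant tandem repeat" case.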

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot (labelled "Sequence 12"). Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of stop codon is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot (labelled "Sequence 12").]

Summary: Full length (2000 bp) | A (21.1%, 422) | C (30.3%, 606) | T (18.2%, 364) | G (30.4%, 608)

Note: The 2000 bp section upstream of start codon is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.
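A windowed GC scan of the kind described in the note can be sketched as follows. The 300 bp window comes from the report. The cutoff for a "high GC" window is not stated in the report, so the `threshold` parameter below is a hypothetical value chosen for illustration.

```python
def gc_windows(seq: str, window: int = 300):
    """Return [(start, gc_fraction)] for every window of the given size (step 1)."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    count = sum(is_gc[:window])
    result = [(0, count / window)]
    # Slide the window one base at a time, updating the running GC count.
    for start in range(1, len(seq) - window + 1):
        count += is_gc[start + window - 1] - is_gc[start - 1]
        result.append((start, count / window))
    return result


def high_gc_window_starts(seq: str, window: int = 300, threshold: float = 0.7):
    """Start positions of windows whose GC fraction is at least `threshold`.

    The default threshold is an assumption, not a value from the report.
    """
    return [start for start, frac in gc_windows(seq, window) if frac >= threshold]


# Toy example: an AT-rich half followed by a GC-rich half, scanned with a 10 bp window.
toy = "AT" * 5 + "GC" * 5
print(gc_windows(toy, window=10)[:3])
print(high_gc_window_starts(toy, window=10, threshold=0.65))
```

Candidate gRNA positions could then be compared against the flagged window starts.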

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot (labelled "Sequence 12").]

Summary: Full length (2000 bp) | A (27.8%, 556) | C (20.25%, 405) | T (29.75%, 595) | G (22.2%, 444)

Note: The 2000 bp section downstream of stop codon is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
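The overall GC content of each flanking region can be recomputed from the base counts in the two summaries. The sketch below uses only the counts given in this report; the function name is hypothetical.

```python
# Base counts copied from the two 2000 bp summaries in this report.
UPSTREAM = {"A": 422, "C": 606, "T": 364, "G": 608}
DOWNSTREAM = {"A": 556, "C": 405, "T": 595, "G": 444}


def composition(counts):
    """Return (total_bp, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    percentages = {base: 100 * n / total for base, n in counts.items()}
    gc_pct = 100 * (counts["G"] + counts["C"]) / total
    return total, percentages, gc_pct


for label, counts in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    total, pct, gc = composition(counts)
    print(f"{label}: {total} bp, GC = {gc:.2f}%")
```

The counts give an overall GC content of about 60.7% upstream and 42.45% downstream, which fits the notes: high-GC regions upstream, none downstream.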

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  -       86883898   86885897   2000
YourSeq  25     251    280   2000   85.2%     chr1   -       150856894  150856921  28
YourSeq  22     895    916   2000   100.0%    chr8   -       79065516   79065537   22
YourSeq  22     894    916   2000   100.0%    chr6   -       46628649   46628672   24
YourSeq  22     896    918   2000   100.0%    chr1   -       46912734   46912757   24
YourSeq  22     1611   1632  2000   100.0%    chr7   +       137226601  137226622  22
YourSeq  22     894    915   2000   100.0%    chr17  +       5536751    5536772    22
YourSeq  22     1849   1871  2000   100.0%    chr1   +       19214225   19214251   27
YourSeq  21     892    912   2000   100.0%    chr9   -       106946131  106946151  21

Note: The 2000 bp section upstream of start codon is BLAT searched against the genome. No significant similarity is found.
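To show how such a result list might be screened, the sketch below parses rows in the column order shown above and lists hits other than the best-scoring (self) match. The report does not define "significant", so the `min_score` cutoff is a hypothetical parameter, and the type and function names are illustrative. The sample rows are the first four upstream hits from this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float   # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hits(text: str):
    """Parse whitespace-separated rows: QUERY SCORE START END QSIZE IDENTITY CHROM STRAND START END SPAN."""
    hits = []
    for line in text.strip().splitlines():
        f = line.split()
        if len(f) != 11 or f[0] != "YourSeq":
            continue  # skip headers, separators, and malformed rows
        hits.append(BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                            float(f[5].rstrip("%")), f[6], f[7],
                            int(f[8]), int(f[9]), int(f[10])))
    return hits


def secondary_hits(hits, min_score: int = 100):
    """Hits other than the top-scoring one with score >= min_score (cutoff is an assumption)."""
    best = max(hits, key=lambda h: h.score)
    return [h for h in hits if h is not best and h.score >= min_score]


UPSTREAM_ROWS = """
YourSeq 2000 1 2000 2000 100.0% chr12 - 86883898 86885897 2000
YourSeq 25 251 280 2000 85.2% chr1 - 150856894 150856921 28
YourSeq 22 895 916 2000 100.0% chr8 - 79065516 79065537 22
YourSeq 22 894 916 2000 100.0% chr6 - 46628649 46628672 24
"""

hits = parse_hits(UPSTREAM_ROWS)
print("secondary hits above cutoff:", secondary_hits(hits))
```

For these upstream rows the strongest non-self hit scores 25, so the list is empty at the assumed cutoff.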

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr12  -       86879573   86881572   2000
YourSeq  752    1      870   2000   95.6%     chr11  +       3629674    3630485    812
YourSeq  674    1      872   2000   92.1%     chr11  +       33647786   33648609   824
YourSeq  335    1      345   2000   98.0%     chr11  -       64759376   64759719   344
YourSeq  40     1360   1460  2000   71.8%     chr13  +       72220228   72220302   75
YourSeq  39     1420   1481  2000   76.4%     chr13  -       30920235   30920292   58
YourSeq  39     1351   1465  2000   80.0%     chr11  -       31658678   31658790   113
YourSeq  36     493    570   2000   92.7%     chr12  +       16595865   16596008   144
YourSeq  35     397    562   2000   59.5%     chr11  -       27902019   27902070   52
YourSeq  34     1360   1465  2000   87.5%     chr2   -       20708238   20708342   105
YourSeq  33     1432   1474  2000   92.2%     chr10  -       107145176  107145218  43
YourSeq  33     1332   1370  2000   92.4%     chr4   +       4432970    4433008    39
YourSeq  30     1399   1444  2000   94.3%     chr7   -       35964900   35964964   65
YourSeq  30     1128   1157  2000   100.0%    chr14  +       65035560   65035589   30
YourSeq  30     1433   1476  2000   91.5%     chr12  +       119460591  119460634  44
YourSeq  30     542    574   2000   96.9%     chr12  +       110116592  110116626  35
YourSeq  29     539    571   2000   83.4%     chr1   -       87968672   87968701   30
YourSeq  28     1431   1464  2000   91.2%     chr19  -       38077499   38077532   34
YourSeq  27     443    478   2000   87.9%     chr3   +       113295877  113295916  40
YourSeq  25     1344   1370  2000   96.3%     chr3   -       22032681   22032707   27

Note: The 2000 bp section downstream of stop codon is BLAT searched against the genome. No significant similarity is found.

Gene and information:

Irf2bpl interferon regulatory factor 2 binding protein-like [Mus musculus (house mouse)]
Gene ID: 238330, updated on 12-Aug-2019

Gene summary

Official Symbol: Irf2bpl (provided by MGI)
Official Full Name: interferon regulatory factor 2 binding protein-like (provided by MGI)
Primary source: MGI:MGI:2442463
See related: Ensembl:ENSMUSG00000034168
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Eap1; 6430527G18Rik
Orthologs: human, all

Genomic context

Location: 12; 12 D2

Exon count: 1

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (86880703..86884814, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (88221653..88225764, complement)

[Figure: genomic context of Irf2bpl on chromosome 12 (NC_000078.6).]

Transcript information: This gene has 1 transcript

Gene: Irf2bpl ENSMUSG00000034168

Description: interferon regulatory factor 2 binding protein-like [Source:MGI Symbol;Acc:MGI:2442463]
Gene Synonyms: 6430527G18Rik
Location: Chromosome 12: 86,880,701-86,884,798 reverse strand. GRCm38:CM001005.2
About this gene: This gene has 1 transcript (splice variant), 140 orthologues, 5 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Irf2bpl-201  ENSMUST00000038422.7  4098  775aa    ENSMUSP00000041070.6  Protein coding  CCDS26068  Q8K3X4   TSL:NA, GENCODE basic, APPRIS P1
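The coordinates reported in this report can be cross-checked against the transcript length with simple arithmetic. The sketch below assumes 1-based, inclusive coordinates; the variable and function names are hypothetical.

```python
def span_bp(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


# Coordinates and transcript length copied from this report.
ENSEMBL_GRCM38 = (86_880_701, 86_884_798)   # Ensembl gene location, chromosome 12
NCBI_GRCM38 = (86_880_703, 86_884_814)      # NCBI annotation release 108
NCBI_MGSCV37 = (88_221_653, 88_225_764)     # NCBI Build 37.2
TRANSCRIPT_BP = 4098                        # Irf2bpl-201

ensembl_span = span_bp(*ENSEMBL_GRCM38)
ncbi_span = span_bp(*NCBI_GRCM38)
ncbi_old_span = span_bp(*NCBI_MGSCV37)

print(f"Ensembl span: {ensembl_span} bp (transcript: {TRANSCRIPT_BP} bp)")
print(f"NCBI spans: {ncbi_span} bp (GRCm38.p6), {ncbi_old_span} bp (MGSCv37)")
```

The Ensembl span equals the 4098 bp transcript length, as expected for a single-exon gene with no introns. The NCBI annotation spans 4112 bp on both assemblies.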

[Figure: Ensembl genomic view of a 24.10 kb region of chromosome 12 (86.875 Mb to 86.890 Mb), showing contig AC110377.5, the Irf2bpl-201 protein-coding transcript on the reverse strand, and the Regulatory Build track. Regulation legend: CTCF; Enhancer; Promoter; Promoter Flank. Gene legend: protein coding, merged Ensembl/Havana.]

Transcript: ENSMUST00000038422

[Figure: protein view of Irf2bpl-201 (protein coding; reverse strand, 4.10 kb; scale bar 0 to 775 aa) for ENSMUSP00000041..., with tracks for MobiDB lite, Low complexity (Seg), Coiled-coils (Ncoils), Superfamily SSF57850, Pfam (Interferon regulatory factor 2-binding protein 1 & 2, zinc finger), PANTHER PTHR10816:SF14 and PTHR10816, Gene3D 1.10.10.1580, CDD cd16511, and sequence variants (dbSNP and all other sources). Variant legend: inframe deletion; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
