https://www.alphaknockout.com

Mouse Gfi1b Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Gfi1b conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Gfi1b gene (NCBI Reference Sequence: NM_001160406; Ensembl: ENSMUSG00000026815) is located on mouse chromosome 2. Seven exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 7 (Transcript: ENSMUST00000164290). Exons 4~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Gfi1b gene.

To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP24-118E9 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for disruption of this gene die as embryos by day E15. Mature adult red blood cells and megakaryocytes fail to develop.

Exon 4 starts at about 21.95% of the coding region. Knocking out exons 4~5 will result in a frameshift of the gene. The size of intron 3, for 5'-loxP site insertion, is 806 bp, and the size of intron 5, for 3'-loxP site insertion, is 1097 bp. The size of the effective cKO region is ~1668 bp. The cKO region does not contain any other known gene.
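The frameshift statement above rests on simple arithmetic: removing coding sequence whose total length is not a multiple of 3 shifts the reading frame of everything downstream. A minimal Python sketch of that check follows. The function name and the exon lengths in the usage lines are hypothetical placeholders; this report does not list the actual Gfi1b exon sizes.

```python
def causes_frameshift(deleted_coding_lengths):
    """Return True if deleting exons with these coding lengths (bp) shifts the frame.

    A deletion keeps the reading frame only when the removed coding
    sequence is a whole number of codons (a multiple of 3 bp).
    """
    return sum(deleted_coding_lengths) % 3 != 0


# Hypothetical coding lengths for two deleted exons (NOT taken from this report).
example_lengths = [160, 141]
print(causes_frameshift(example_lengths))   # 301 bp removed, not a multiple of 3
print(causes_frameshift([150, 150]))        # 300 bp removed, frame preserved
```

With the real exon 4 and exon 5 coding lengths substituted, the same check would confirm or refute the frameshift prediction.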


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1-7, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Gfi1b; homology arm; cKO region; loxP site.]


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself. Legend: Forward; Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
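As an illustration of the self-alignment idea, the sketch below finds every pair of positions at which the same 10 bp word occurs in a sequence. In a dot plot these pairs are the off-diagonal dots, and runs of them form the diagonals that indicate repeats. This is a simplified stand-in for the tool that produced the plot, which is not named in this report: it considers forward-strand matches only, and the function name and toy sequence are assumptions.

```python
def dot_plot_matches(seq, window=10):
    """Return sorted (i, j) pairs, i < j, where seq[i:i+window] == seq[j:j+window]."""
    seq = seq.upper()
    positions = {}
    # Index every window-length word by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    matches = []
    # Any word seen more than once contributes off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy sequence (hypothetical): one 10 bp unit repeated after a 3 bp spacer.
unit = "ACGTTGCAAT"
toy = unit + "GGG" + unit
print(dot_plot_matches(toy, window=10))   # [(0, 13)]
```

Applied to the 8168 bp construct sequence, a long list of matches clustered along one off-diagonal would correspond to the tandem repeats reported in the note.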

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(8168bp) | A(22.86% 1867) | C(27.29% 2229) | T(24.44% 1996) | G(25.42% 2076)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
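The base counts in the summary can be turned into an overall GC percentage, and the 300 bp windowed profile can be reproduced with a simple sliding window. The sketch below is one plausible way to do this in Python, not the tool actually used for this report. The base counts come from the summary line; the function name and the short test sequence are assumptions.

```python
# Base counts copied from the summary line of this report.
counts = {"A": 1867, "C": 2229, "T": 1996, "G": 2076}
total = sum(counts.values())                       # full length in bp
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def windowed_gc(seq, window=300):
    """Return GC percentage for every full window, sliding one base at a time."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Toy sequence (hypothetical) with a small window so the result is easy to check.
print(windowed_gc("GGCCAATT", window=4))   # [100.0, 75.0, 50.0, 25.0, 0.0]
```

The counts sum to the stated full length of 8168 bp and give an overall GC content of about 52.7%. The overall figure is moderate, so the high-GC warning presumably refers to local peaks in the windowed profile.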


BLAT Search Results (up)

QUERY    SCORE START   END QSIZE IDENTITY CHROM  STRAND     START       END SPAN
---------------------------------------------------------------------------------
YourSeq   3000     1  3000  3000   100.0% chr2     -     28614199  28617198 3000
YourSeq     97  2625  2957  3000    94.6% chr15    +     83082462  83082831  370
YourSeq     76  2868  2961  3000    89.2% chr10    -     57294917  57295009   93
YourSeq     72   926  1039  3000    91.1% chr9     -     43094850  43094963  114
YourSeq     72  2865  2952  3000    91.0% chr4     -     43405380  43405467   88
YourSeq     72  2855  2957  3000    89.9% chr1     -     98116967  98117076  110
YourSeq     71  2865  2957  3000    88.2% chr15    -     82171931  82172023   93
YourSeq     71  2869  2962  3000    84.8% chr10    -     73418998  73419089   92
YourSeq     71  2865  2957  3000    88.3% chr1     -    178476583 178476673   91
YourSeq     68  2866  2957  3000    87.0% chr1     -     39580510  39580601   92
YourSeq     67   953  1059  3000    80.8% chr17    +     36174642  36174738   97
YourSeq     67  2873  2957  3000    89.7% chr13    +    101071370 101071452   83
YourSeq     66  2873  2957  3000    90.4% chr16    -     17338811  17338895   85
YourSeq     66  2868  2957  3000    86.6% chr11    +     98989104  98989191   88
YourSeq     65  2865  2953  3000    87.1% chr18    -      6619708   6619793   86
YourSeq     65   951  1043  3000    88.4% chr15    -     95464148  95464238   91
YourSeq     65  2873  2951  3000    91.2% chr15    -     82251961  82252039   79
YourSeq     65  2871  2967  3000    83.6% chr12    -    111159015 111159111   97
YourSeq     65  2873  2957  3000    88.3% chr11    -     87773256  87773340   85
YourSeq     65  2873  2957  3000    88.4% chr6     +    134361823 134361905   83

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE START   END QSIZE IDENTITY CHROM  STRAND     START       END SPAN
---------------------------------------------------------------------------------
YourSeq   3000     1  3000  3000   100.0% chr2     -     28609531  28612530 3000
YourSeq    139  1921  2190  3000    90.6% chr7     +    127916612 127916986  375
YourSeq    135  1923  2164  3000    88.5% chr16    -     76960378  76960639  262
YourSeq    114  2030  2194  3000    85.5% chrX     +     44862144  44862306  163
YourSeq    112  2053  2202  3000    90.6% chr15    +     88817941  88818096  156
YourSeq    111  2004  2190  3000    82.1% chr9     -     20991461  20991622  162
YourSeq    111  1949  2172  3000    89.4% chr2     -    118931590 118931867  278
YourSeq    111  2045  2194  3000    91.2% chr15    -     90175448  90175600  153
YourSeq    111  1998  2189  3000    80.3% chr13    +     91830523  91830682  160
YourSeq    110  2052  2190  3000    87.5% chr11    +     21897287  21897422  136
YourSeq    106  2325  2466  3000    87.4% chr5     -    107717780 107717921  142
YourSeq    106  2056  2190  3000    88.2% chr17    -     46649177  46649308  132
YourSeq    102  1774  2190  3000    75.6% chr8     -    107098770 107098919  150
YourSeq    101  2015  2175  3000    87.0% chr5     +    145169314 145169471  158
YourSeq    101  2011  2184  3000    82.6% chr5     +    122185212 122185367  156
YourSeq    101  2052  2190  3000    86.9% chr14    +     30580658  30580797  140
YourSeq    101  2001  2224  3000    80.8% chr11    +     76009484  76009644  161
YourSeq    100  2055  2190  3000    85.6% chr5     +     73341671  73341802  132
YourSeq    100  1977  2193  3000    81.0% chr14    +     11822425  11822589  165
YourSeq     99  2042  2190  3000    84.7% chr5     +     53611648  53611789  142

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
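The top hits of the two BLAT searches place the upstream and downstream 3000 bp flanks on chr2, so the distance between them can be cross-checked against the stated cKO region size. The sketch below does this in Python. It assumes the BLAT coordinates are 1-based and inclusive, which is consistent with each 3000 bp hit satisfying end - start + 1 = 3000. The helper names are hypothetical.

```python
# Top BLAT hits for the two flanks, copied from the results above:
# (chromosome, strand, start, end)
UPSTREAM_HIT = ("chr2", "-", 28614199, 28617198)
DOWNSTREAM_HIT = ("chr2", "-", 28609531, 28612530)


def interval_length(hit):
    """Length of a hit, assuming 1-based inclusive coordinates."""
    _, _, start, end = hit
    return end - start + 1


def gap_between(hit_a, hit_b):
    """Number of bases strictly between two non-overlapping hits on one chromosome."""
    if hit_a[0] != hit_b[0]:
        raise ValueError("hits are on different chromosomes")
    left, right = sorted([hit_a[2:], hit_b[2:]])
    if right[0] <= left[1]:
        raise ValueError("hits overlap")
    return right[0] - left[1] - 1


print(interval_length(UPSTREAM_HIT), interval_length(DOWNSTREAM_HIT))   # 3000 3000
print(gap_between(UPSTREAM_HIT, DOWNSTREAM_HIT))                        # 1668
```

The 1668 bp gap agrees with the effective cKO region size given in the strategy summary. Because the gene lies on the reverse strand, the upstream flank has the higher genomic coordinates.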


Gene information:

Gfi1b growth factor independent 1B [ Mus musculus (house mouse) ]
Gene ID: 14582, updated on 10-Oct-2019

Gene summary

Official Symbol: Gfi1b (provided by MGI)
Official Full Name: growth factor independent 1B (provided by MGI)
Primary source: MGI:MGI:1276578
See related: Ensembl:ENSMUSG00000026815
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gfi-1B
Expression: Biased expression in liver E14.5 (RPKM 51.1), liver E14 (RPKM 38.6) and 2 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 A3

Exon count: 7

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (28609450..28621982, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (28464970..28477502, complement)

[Figure: genomic context of Gfi1b on Chromosome 2 - NC_000068.7]
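As a quick consistency check on the coordinates above, the gene span can be computed for both assemblies. The Python sketch below assumes the NCBI coordinates are 1-based and inclusive; the variable and function names are illustrative only.

```python
def span_length(start, end):
    """Length in bp of a 1-based inclusive coordinate range."""
    return end - start + 1


# Gfi1b coordinates on chromosome 2, copied from the table above.
GRCM38 = (28609450, 28621982)    # current assembly, NC_000068.7
MGSCV37 = (28464970, 28477502)   # previous assembly, NC_000068.6

grcm38_len = span_length(*GRCM38)
mgscv37_len = span_length(*MGSCV37)
offset = GRCM38[0] - MGSCV37[0]  # coordinate shift between the two assemblies

print(grcm38_len, mgscv37_len, offset)   # 12533 12533 144480
```

The gene spans 12,533 bp in both assemblies, which agrees with the 12.53 kb shown for the transcript view, and the two coordinate sets differ by a constant shift of 144,480 bp.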


Transcript information: This gene has 4 transcripts

Gene: Gfi1b ENSMUSG00000026815

Description: growth factor independent 1B [Source:MGI Symbol;Acc:MGI:1276578]
Gene Synonyms: Gfi-1B
Location: Chromosome 2: 28,609,450-28,621,982 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 4 transcripts (splice variants), 193 orthologues, 26 paralogues, is a member of 1 Ensembl protein family and is associated with 14 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Gfi1b-204  ENSMUST00000164290.7  1822  363aa       ENSMUSP00000128052.1  Protein coding  CCDS50549  B7ZNH2   TSL:1 GENCODE basic APPRIS ALT2
Gfi1b-201  ENSMUST00000028156.7  1681  330aa       ENSMUSP00000028156.7  Protein coding  CCDS15843  O70237   TSL:1 GENCODE basic APPRIS P3
Gfi1b-202  ENSMUST00000145690.1  357   No protein  -                     lncRNA          -          -        TSL:2
Gfi1b-203  ENSMUST00000155686.1  238   No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl genome browser view of a 32.53 kb region of chromosome 2 (28.60-28.63 Mb), forward and reverse strands. Tracks: Gm22824-201 (miRNA); contig AL731851.13; genes (comprehensive set) Gfi1b-204 and Gfi1b-201 (protein coding) and Gfi1b-203 and Gfi1b-202 (lncRNA), all on the reverse strand; Regulatory Build. Regulation legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000164290

[Figure: protein domain view of Gfi1b-204 (protein coding; reverse strand, 12.53 kb) for protein ENSMUSP00000128..., scale 0-363 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily: Zinc finger C2H2 superfamily; SMART: Zinc finger C2H2-type; Pfam: Zinc finger C2H2-type; PROSITE profiles: Zinc finger C2H2-type; PROSITE patterns: Zinc finger C2H2-type; PANTHER: Zinc finger protein Gfi-1b, PTHR24393; Gene3D: 3.30.160.60; sequence variants (dbSNP and all other sources). Variant legend: missense variant; splice region variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
