https://www.alphaknockout.com

Mouse Epb42 Knockout Project (CRISPR/Cas9)

Objective: To create an Epb42 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Epb42 gene (NCBI Reference Sequence: NM_013513; Ensembl: ENSMUSG00000023216) is located on mouse chromosome 2. Thirteen exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000102490). Exons 2-5 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Homozygotes for a targeted null mutation exhibit erythrocytic abnormalities including mild spherocytosis, altered ion transport, and dehydration.

Exon 2 starts at about 0.53% of the coding region. Exons 2-5 cover 31.07% of the coding region. The effective KO region is ~5547 bp in size. The KO region does not contain any other known gene.
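The stated KO region size can be cross-checked against the chr2 coordinates of the two flanking windows listed in the BLAT results of this report. The sketch below assumes that the effective KO region is exactly the interval between the downstream and upstream flanking windows; the function names are illustrative and not part of any tool used for the report.

```python
# Flanking windows on chr2 (GRCm38), taken from the BLAT self-hits in this report.
# Epb42 lies on the reverse strand, so the "upstream" window has the higher coordinates.
UPSTREAM_FLANK = (121034599, 121036583)    # 1985 bp section upstream of exon 2
DOWNSTREAM_FLANK = (121027822, 121029051)  # 1230 bp section downstream of exon 5


def span_length(start, end):
    """Length of a closed (1-based, inclusive) genomic interval."""
    return end - start + 1


def region_between(lower, upper):
    """Closed interval lying strictly between two non-overlapping intervals."""
    return lower[1] + 1, upper[0] - 1


# Assumption: the KO region is the gap between the two flanking windows.
ko_start, ko_end = region_between(DOWNSTREAM_FLANK, UPSTREAM_FLANK)
ko_size = span_length(ko_start, ko_end)

print(f"Upstream flank:   {span_length(*UPSTREAM_FLANK)} bp")
print(f"Downstream flank: {span_length(*DOWNSTREAM_FLANK)} bp")
print(f"Region between flanks: chr2:{ko_start}-{ko_end} ({ko_size} bp)")
```

Under that assumption the gap between the flanks comes out at 5547 bp, which agrees with the KO region size given above.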


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (5' to 3') with exons 1, 2, 3, 4, 5 and 13 labelled and the gRNA regions marked. Legend: exon of mouse Epb42; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of the upstream sequence aligned against itself. Legend: forward; reverse complement.]

Note: The 1985 bp section upstream of exon 2 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
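The self-alignment idea can be sketched in a few lines. The example below is a minimal exact-match dot plot with a fixed word size; it is an illustration only, since the report does not say which dot plot tool or matching rule was used, and the function name `self_dot_plot` is hypothetical.

```python
def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact matches of `window`-bp words.

    Returns a list of (i, j, orientation) tuples, where the word starting at j
    matches the word starting at i either directly ("forward") or as its
    reverse complement ("reverse"). The trivial main diagonal (i == j) is skipped.
    """
    seq = seq.upper()
    complement = str.maketrans("ACGT", "TGCA")

    # Index every word of length `window` by its start positions.
    index = {}
    for i in range(len(seq) - window + 1):
        index.setdefault(seq[i:i + window], []).append(i)

    hits = []
    for j in range(len(seq) - window + 1):
        word = seq[j:j + window]
        for i in index.get(word, []):
            if i != j:  # off-diagonal forward match, i.e. a direct repeat
                hits.append((i, j, "forward"))
        reverse_complement = word.translate(complement)[::-1]
        for i in index.get(reverse_complement, []):
            hits.append((i, j, "reverse"))
    return hits


# Toy input: the 5-bp unit ACGGT repeated twice, scanned with a 5-bp window.
toy_hits = self_dot_plot("ACGGTACGGT", window=5)
print(toy_hits)
```

Off-diagonal forward hits that line up parallel to the main diagonal are what a tandem repeat looks like in such a plot; an empty hit list corresponds to the "no significant tandem repeat" outcome described in the note.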

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of the downstream sequence aligned against itself. Legend: forward; reverse complement.]

Note: The 1230 bp section downstream of exon 5 is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution along the upstream sequence.]

Summary: Full Length(1985bp) | A(28.72% 570) | C(20.3% 403) | T(25.34% 503) | G(25.64% 509)

Note: The 1985 bp section upstream of exon 2 is analyzed to determine its GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
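A sliding-window GC scan of the kind summarized above can be sketched as follows. The 300 bp window matches the report; the step size and the 70% cut-off for a "high GC" window are assumptions made for illustration, as the report does not state the criteria it applied.

```python
def gc_windows(seq, window=300, step=1):
    """Return (start, gc_fraction) for each window of `window` bp along `seq`."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / window))
    return results


def high_gc_windows(seq, window=300, step=1, threshold=0.70):
    """Windows whose GC fraction reaches `threshold` (hypothetical cut-off)."""
    return [(s, f) for s, f in gc_windows(seq, window, step) if f >= threshold]


# Toy input: 20 bp of pure GC followed by 20 bp of pure AT, 20-bp windows.
toy = "GC" * 10 + "AT" * 10
print(gc_windows(toy, window=20, step=20))
print(high_gc_windows(toy, window=20, step=20))
```

Applied to the real flanking sequence, an empty result from `high_gc_windows` would correspond to the conclusion in the note, subject to the chosen threshold.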

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution along the downstream sequence.]

Summary: Full Length(1230bp) | A(28.46% 350) | C(23.25% 286) | T(23.58% 290) | G(24.72% 304)

Note: The 1230 bp section downstream of exon 5 is analyzed to determine its GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
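The two composition summaries can be checked arithmetically from the base counts they report. The short sketch below recomputes the totals and percentages and derives the overall GC content of each flank, a figure the report does not state explicitly; the dictionary and function names are illustrative.

```python
# Base counts copied from the two "Summary" lines of this report.
UPSTREAM_COUNTS = {"A": 570, "C": 403, "T": 503, "G": 509}    # 1985 bp section
DOWNSTREAM_COUNTS = {"A": 350, "C": 286, "T": 290, "G": 304}  # 1230 bp section


def summarize(counts):
    """Return (total length, per-base percentages, overall GC percentage)."""
    total = sum(counts.values())
    percentages = {base: round(100 * n / total, 2) for base, n in counts.items()}
    gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)
    return total, percentages, gc_percent


for label, counts in (("upstream", UPSTREAM_COUNTS), ("downstream", DOWNSTREAM_COUNTS)):
    total, percentages, gc_percent = summarize(counts)
    print(f"{label}: {total} bp, {percentages}, GC = {gc_percent}%")
```

Both totals match the stated section lengths, and the overall GC content works out to roughly 46% upstream and 48% downstream.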


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  1985   1      1985  1985   100.0%    chr2   -       121034599  121036583  1985
YourSeq  107    297    644   1985   84.7%     chr9   -       103918676  103919020  345
YourSeq  107    322    644   1985   77.5%     chr13  +       91727519   91727804   286
YourSeq  104    264    627   1985   90.0%     chr6   -       85926240   85926637   398
YourSeq  102    396    651   1985   84.4%     chr18  +       10166780   10167049   270
YourSeq  100    380    615   1985   86.7%     chr13  -       107335305  107335549  245
YourSeq  98     407    629   1985   87.7%     chr1   +       179590699  179590951  253
YourSeq  96     407    638   1985   85.8%     chr2   +       70382565   70382826   262
YourSeq  95     301    644   1985   86.9%     chr7   -       114555601  114555962  362
YourSeq  95     303    644   1985   89.4%     chr9   +       80189433   80189776   344
YourSeq  94     437    644   1985   89.5%     chr5   -       148477055  148477573  519
YourSeq  94     446    638   1985   86.7%     chr17  -       75141858   75142066   209
YourSeq  94     437    644   1985   90.7%     chr16  -       23377270   23377614   345
YourSeq  94     262    551   1985   91.4%     chr11  +       88782882   88783180   299
YourSeq  88     420    638   1985   89.2%     chr12  -       98993269   98993505   237
YourSeq  85     413    599   1985   89.8%     chr5   +       126311488  126311676  189
YourSeq  85     460    644   1985   90.6%     chr17  +       35319924   35320282   359
YourSeq  84     437    644   1985   92.0%     chr19  -       4638969    4639201    233
YourSeq  83     379    644   1985   86.0%     chr15  +       40128726   40129022   297
YourSeq  81     303    538   1985   81.9%     chr19  -       18941297   18941513   217

Note: The 1985 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  1230   1      1230  1230   100.0%    chr2   -       121027822  121029051  1230
YourSeq  152    439    802   1230   88.4%     chr4   -       106668720  106669087  368
YourSeq  125    536    787   1230   85.4%     chr16  +       43836060   43836334   275
YourSeq  122    439    681   1230   88.2%     chr10  +       86831118   86831364   247
YourSeq  120    445    690   1230   88.1%     chr7   -       65673317   65673567   251
YourSeq  119    450    681   1230   85.3%     chr8   -       43250195   43250431   237
YourSeq  119    439    690   1230   87.4%     chr12  -       52809512   52809765   254
YourSeq  113    541    790   1230   89.1%     chr8   -       119310700  119310963  264
YourSeq  113    450    682   1230   87.1%     chr17  +       85092268   85092501   234
YourSeq  112    523    690   1230   89.6%     chr15  -       32037235   32037405   171
YourSeq  111    450    794   1230   88.3%     chr7   -       127832301  127832652  352
YourSeq  109    539    797   1230   86.7%     chr2   +       118564086  118564342  257
YourSeq  108    450    678   1230   85.2%     chr11  -       93794865   93795095   231
YourSeq  105    400    690   1230   89.5%     chr4   +       149811459  149811749  291
YourSeq  104    439    794   1230   87.7%     chr1   -       64969085   64969442   358
YourSeq  101    538    794   1230   92.5%     chr8   -       118501610  118501911  302
YourSeq  101    523    665   1230   86.8%     chr19  -       24240249   24334012   93764
YourSeq  101    450    792   1230   86.5%     chr7   +       145288121  145288462  342
YourSeq  101    457    638   1230   82.5%     chr18  +       15028028   15028209   182
YourSeq  100    447    792   1230   85.8%     chr10  -       124914750  124915096  347

Note: The 1230 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
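One way to read a BLAT table like the ones above is to compare the best secondary hit with the full-length self-hit. The sketch below parses whitespace-separated rows and reports the best off-target score as a fraction of the query size; it uses the first three downstream rows as sample input, and the `BlatHit` type and `parse_hit` helper are illustrative names rather than part of BLAT. The report does not state what score it treats as significant, so no cut-off is applied here.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one whitespace-separated row; the first field is the query name."""
    f = line.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


# Sample rows copied from the downstream BLAT table in this report.
ROWS = """\
YourSeq 1230 1 1230 1230 100.0% chr2 - 121027822 121029051 1230
YourSeq 152 439 802 1230 88.4% chr4 - 106668720 106669087 368
YourSeq 125 536 787 1230 85.4% chr16 + 43836060 43836334 275
"""

hits = [parse_hit(line) for line in ROWS.splitlines() if line.strip()]
self_hit = max(hits, key=lambda h: h.score)            # the query's own locus
off_target = [h for h in hits if h is not self_hit]    # everything else
best_off = max(off_target, key=lambda h: h.score)
ratio = best_off.score / self_hit.q_size

print(f"Self hit: {self_hit.chrom}:{self_hit.t_start}-{self_hit.t_end}")
print(f"Best off-target: {best_off.chrom}, score {best_off.score} "
      f"({ratio:.1%} of query size)")
```

For the downstream section the best secondary hit scores 152 against a 1230 bp query; the corresponding figure for the upstream section is 107 against 1985 bp.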


Gene and information: Epb42 erythrocyte membrane protein band 4.2 [ Mus musculus (house mouse) ] Gene ID: 13828, updated on 12-Aug-2019

Gene summary

Official Symbol: Epb42 (provided by MGI)
Official Full Name: erythrocyte membrane protein band 4.2 (provided by MGI)
Primary source: MGI:MGI:95402
See related: Ensembl:ENSMUSG00000023216
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Epb4.2
Summary: The protein encoded by this gene is the key component of a macromolecular complex involved in the structure of erythrocytes. [provided by RefSeq, Aug 2015]
Expression: Biased expression in liver E14.5 (RPKM 57.9), liver E14 (RPKM 55.2) and 3 other tissues
Orthologs: human, all

Genomic context

Location: 2 E5; 2 60.37 cM
Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (121017891..121036877, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (120843953..120862491, complement)

[Figure: genomic context of Epb42 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 7 transcripts

Gene: Epb42 ENSMUSG00000023216

Description: erythrocyte membrane protein band 4.2 [Source:MGI Symbol;Acc:MGI:95402]
Gene Synonyms: Epb4.2
Location: Chromosome 2: 121,017,891-121,037,072 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 7 transcripts (splice variants), 131 orthologues, 8 paralogues, is a member of 1 Ensembl protein family and is associated with 13 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Epb42-202  ENSMUST00000102490.9  4115  691aa       ENSMUSP00000099548.3  Protein coding   CCDS16631  P49222   TSL:1 GENCODE basic APPRIS P1
Epb42-201  ENSMUST00000023987.5  2695  673aa       ENSMUSP00000023987.5  Protein coding   -          Q3UV95   TSL:1 GENCODE basic
Epb42-205  ENSMUST00000145812.1  625   No protein  -                     Retained intron  -          -        TSL:3
Epb42-203  ENSMUST00000124703.7  2513  No protein  -                     lncRNA           -          -        TSL:1
Epb42-206  ENSMUST00000147444.7  1752  No protein  -                     lncRNA           -          -        TSL:1
Epb42-207  ENSMUST00000152217.1  680   No protein  -                     lncRNA           -          -        TSL:5
Epb42-204  ENSMUST00000128360.1  366   No protein  -                     lncRNA           -          -        TSL:5


[Figure: Ensembl genome browser view of a 39.18 kb region (121.01 Mb to 121.04 Mb). Forward strand: Ccndbp1-201, Ccndbp1-202, Ccndbp1-203 and Ccndbp1-208 (protein coding); Ccndbp1-205 and Ccndbp1-207 (retained intron); Ccndbp1-204 and Ccndbp1-206 (lncRNA). Contig: AL844548.7. Reverse strand: Epb42-201 and Epb42-202 (protein coding); Epb42-203, Epb42-204, Epb42-206 and Epb42-207 (lncRNA); Epb42-205 (retained intron); Tgm5-201 (protein coding). Tracks: Contigs; Genes (Comprehensive set); Regulatory Build. Regulation legend: CTCF; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (processed transcript; RNA gene).]


Transcript: ENSMUST00000102490

[Figure: Ensembl protein view of Epb42-202 (protein coding; reverse strand, 19.18 kb), protein ENSMUSP00000099..., scale 0 to 691 aa. Annotation tracks shown:]

Low complexity (Seg)
Superfamily: Immunoglobulin E-set; Papain-like cysteine peptidase superfamily; Transglutaminase, C-terminal domain superfamily
SMART: Transglutaminase-like
Pfam: Transglutaminase, N-terminal; Transglutaminase-like; Transglutaminase, C-terminal
PROSITE patterns: Transglutaminase, active site
PIRSF: Protein-glutamine gamma-glutamyltransferase, animal
PANTHER: PTHR11590:SF44; PTHR11590
Gene3D: Immunoglobulin-like fold; Transglutaminase-like superfamily
Sequence variants (dbSNP and all other sources); variant legend: missense variant; splice region variant; synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
