https://www.alphaknockout.com

Mouse Epb41l2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Epb41l2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Epb41l2 gene (NCBI Reference Sequence: NM_001199265; Ensembl: ENSMUSG00000019978) is located on mouse chromosome 10. Twenty exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 19 (Transcript: ENSMUST00000053748). Exon 8 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Epb41l2 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-329E17 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit normal glutamatergic synapse formation, AMPAR responses and long-term potentiation. Male mice homozygous for a knock-out allele exhibit normal fertility. Male mice homozygous for a gene trap allele on a mixed background are infertile.

Exon 8 starts at about 38.06% of the coding region. Knockout of exon 8 will result in a frameshift of the gene. The size of intron 7, used for 5'-loxP site insertion, is 1438 bp, and the size of intron 8, used for 3'-loxP site insertion, is 5845 bp. The size of the effective cKO region is ~588 bp. The cKO region does not contain any other known gene.
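The ~588 bp figure can be cross-checked against the BLAT coordinates reported later in this document, and the frameshift statement rests on the usual rule that a deletion of coding sequence shifts the reading frame when its length is not a multiple of three. The following is a minimal sketch of both checks. The function names are hypothetical, and the exon lengths passed to `causes_frameshift` are illustrative values only, since the report does not state the length of exon 8.

```python
# Coordinates copied from the BLAT tables in this report (chr10, + strand).
UPSTREAM_ARM_END = 25472826      # last base of the 3000 bp upstream query
DOWNSTREAM_ARM_START = 25473415  # first base of the 3000 bp downstream query


def region_between(end_of_left: int, start_of_right: int) -> int:
    """Number of bases strictly between two 1-based inclusive coordinates."""
    return start_of_right - end_of_left - 1


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """True when the deleted coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


cko_size = region_between(UPSTREAM_ARM_END, DOWNSTREAM_ARM_START)
print(cko_size)  # 588

# Hypothetical exon lengths, for illustration only (not taken from the report).
print(causes_frameshift(100))  # True
print(causes_frameshift(99))   # False
```

The computed gap between the two 3000 bp flanking queries matches the stated effective cKO region size.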


Overview of the Targeting Strategy

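[Figure: schematic of the targeting strategy]
Panels: Wildtype allele (exons 1, 7, 8 and 20 labeled, with two gRNA regions and the 5' and 3' ends marked); Targeting vector; Targeted allele; Constitutive KO allele (after Cre recombination)
Legend: exon of mouse Epb41l2; homology arm; cKO region; loxP site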
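The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model: Cre excises the sequence between two loxP sites in the same orientation and leaves a single loxP site behind. This is a simplified sketch, not the vendor's design tool. The function name and the toy allele are hypothetical, and the model ignores site orientation and partial sites.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_recombine(allele: str, site: str = LOXP) -> str:
    """Excise the sequence between the first two direct-repeat loxP sites.

    One loxP site remains in the product. If fewer than two sites are
    present, the allele is returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    return allele[:first + len(site)] + allele[second + len(site):]


# Toy floxed allele: lowercase stands in for the cKO region (hypothetical sequence).
floxed = "AAAA" + LOXP + "ggggcccc" + LOXP + "TTTT"
ko = cre_recombine(floxed)
print(ko == "AAAA" + LOXP + "TTTT")  # True
```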


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of the sequence, showing forward and reverse complement matches]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
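A self-alignment dot plot of this kind can be approximated by recording every pair of positions that share an identical word of the chosen window size. The sketch below is a minimal illustration under that assumption. It covers forward matches only (the original plot also shows reverse complement matches), the function names are hypothetical, and the `max_offset` cutoff for calling a tandem-repeat candidate is an arbitrary illustrative value.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, whose window-length words are identical."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def tandem_candidates(dots, max_offset: int = 50):
    """Off-diagonal dots with a small offset hint at tandem repeats."""
    return [(i, j) for i, j in dots if j - i <= max_offset]


# Toy sequence with one 7 bp unit repeated twice (not from the report).
toy_dots = self_dot_plot("GATTACAGATTACA", window=7)
print(toy_dots)  # [(0, 7)]
```

A sequence without repeats of at least the window length yields an empty list, which corresponds to a dot plot showing only the main diagonal.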

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence]

Summary: Full Length(7088bp) | A(23.7% 1680) | C(17.23% 1221) | T(30.76% 2180) | G(28.32% 2007)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.
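The base counts in the Summary line are enough to recompute the percentages and the overall GC content, and a sliding-window GC profile like the one plotted can be sketched in a few lines. The function names are hypothetical, and the toy sequence in the last call is illustrative only because the report does not include the sequence itself.

```python
BASE_COUNTS = {"A": 1680, "C": 1221, "T": 2180, "G": 2007}  # from the Summary line


def percentages(counts):
    """Percentage of each base, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content in percent, rounded to two decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def windowed_gc(seq, window=300, step=1):
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(sum(BASE_COUNTS.values()))  # 7088
print(gc_percent(BASE_COUNTS))    # 45.54
print(windowed_gc("GGCCAATT", window=4, step=2))  # [100.0, 50.0, 0.0]
```

The counts sum to the stated full length of 7088 bp, and the overall GC content works out to about 45.5%.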


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
YourSeq  3000   1       3000  3000   100.0%    chr10  +       25469827   25472826   3000
YourSeq  52     1316    1411  3000   94.9%     chr18  -       20492767   20492862   96
YourSeq  41     466     547   3000   95.7%     chr2   -       17153436   17153550   115
YourSeq  37     2197    2250  3000   78.3%     chr14  +       103323497  103323546  50
YourSeq  36     1317    1356  3000   97.5%     chr11  -       43635917   43635966   50
YourSeq  34     1324    1364  3000   94.8%     chr12  -       66675997   66676037   41
YourSeq  24     1320    1348  3000   92.9%     chr10  -       121022150  121022182  33

Note: The 3000 bp section upstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.
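One way to read the upstream table is to set aside the self hit (the full-length match at the query's own locus) and compare the best remaining score with the query size. The sketch below does this with rows copied from the table. Treating the highest-scoring row as the self hit is an assumption of this example, the function name is hypothetical, and no significance threshold is implied.

```python
# (score, chrom, strand, target_start, target_end) copied from the upstream BLAT table.
UPSTREAM_HITS = [
    (3000, "chr10", "+", 25469827, 25472826),
    (52, "chr18", "-", 20492767, 20492862),
    (41, "chr2", "-", 17153436, 17153550),
    (37, "chr14", "+", 103323497, 103323546),
    (36, "chr11", "-", 43635917, 43635966),
    (34, "chr12", "-", 66675997, 66676037),
    (24, "chr10", "-", 121022150, 121022182),
]
QUERY_SIZE = 3000


def best_off_target(hits, query_size):
    """Return the best non-self hit and its score as a fraction of the query size.

    Assumption: the highest-scoring row is the query matching its own locus.
    """
    self_hit = max(hits, key=lambda h: h[0])
    others = [h for h in hits if h is not self_hit]
    best = max(others, key=lambda h: h[0])
    return best, best[0] / query_size


best, fraction = best_off_target(UPSTREAM_HITS, QUERY_SIZE)
print(best[1], best[0], round(fraction, 3))  # chr18 52 0.017
```

For the upstream section, the strongest hit outside the locus scores 52 against a 3000 bp query, which is under 2% of the self-hit score.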

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART    TEND      SPAN
YourSeq  3000   1       3000  3000   100.0%    chr10  +       25473415  25476414  3000
YourSeq  1451   486     2059  3000   96.5%     chr10  +       25474364  25475941  1578
YourSeq  1196   877     2142  3000   97.5%     chr10  +       25474257  25475692  1436
YourSeq  1145   588     1993  3000   96.1%     chr10  +       25474632  25475977  1346
YourSeq  1088   1371    2563  3000   97.0%     chr10  +       25473951  25475645  1695
YourSeq  931    593     1636  3000   97.8%     chr10  +       25474267  25475918  1652
YourSeq  877    1613    2563  3000   96.6%     chr10  +       25473887  25475143  1257
YourSeq  836    831     1943  3000   95.6%     chr10  +       25474615  25475523  909
YourSeq  829    1239    2533  3000   95.2%     chr10  +       25474023  25475215  1193
YourSeq  826    562     1559  3000   96.0%     chr10  +       25474946  25475977  1032
YourSeq  755    1067    1970  3000   95.3%     chr10  +       25474549  25475354  806
YourSeq  724    1367    2563  3000   95.8%     chr10  +       25473849  25475075  1227
YourSeq  712    593     1651  3000   94.6%     chr10  +       25474807  25475669  863
YourSeq  698    1127    2006  3000   97.1%     chr10  +       25474013  25475224  1212
YourSeq  677    1809    2563  3000   95.3%     chr10  +       25473917  25474701  785
YourSeq  657    1541    2533  3000   96.0%     chr10  +       25473951  25474943  993
YourSeq  169    1342    2461  3000   80.8%     chr17  -       35662316  35662785  470
YourSeq  156    1614    2529  3000   82.9%     chr17  -       35662392  35662823  432
YourSeq  130    1342    2129  3000   82.0%     chr17  -       35662316  35662823  508
YourSeq  123    18      274   3000   89.7%     chr8   -       11339103  11339362  260

Note: The 3000 bp section downstream of Exon 8 is BLAT searched against the genome. No significant similarity is found.


Gene and information:

Epb41l2 erythrocyte membrane protein band 4.1 like 2 [Mus musculus (house mouse)]
Gene ID: 13822, updated on 24-Oct-2019

Gene summary

Official Symbol: Epb41l2 (provided by MGI)
Official Full Name: erythrocyte membrane protein band 4.1 like 2 (provided by MGI)
Primary source: MGI:MGI:103009
See related: Ensembl:ENSMUSG00000019978
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 4.1G; NBL2; AW555191; Epb4.1l2; D10Ertd398e
Expression: Ubiquitous expression in subcutaneous fat pad adult (RPKM 17.3), testis adult (RPKM 16.5) and 26 other tissues
Orthologs: human, all

Genomic context

Location: 10 A4; 10 12.26 cM

Exon count: 26

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (25359782..25523519)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (25128178..25243320)

[Figure: genomic context, Chromosome 10 - NC_000076.6]


Transcript information: This gene has 20 transcripts

Gene: Epb41l2 ENSMUSG00000019978

Description: erythrocyte membrane protein band 4.1 like 2 [Source:MGI Symbol;Acc:MGI:103009]
Gene Synonyms: 4.1G, D10Ertd398e, Epb4.1l2, NBL2
Location: Chromosome 10: 25,359,798-25,523,519 forward strand. GRCm38:CM001003.2
About this gene: This gene has 20 transcripts (splice variants), 224 orthologues, 11 paralogues, is a member of 1 Ensembl protein family and is associated with 7 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Epb41l2-201  ENSMUST00000053748.15  4385  988aa       ENSMUSP00000055122.8  Protein coding   CCDS23754  O70318      TSL:1 GENCODE basic APPRIS P2
Epb41l2-202  ENSMUST00000092645.6   4280  988aa       ENSMUSP00000090314.6  Protein coding   CCDS23754  O70318      TSL:5 GENCODE basic APPRIS P2
Epb41l2-219  ENSMUST00000220290.1   4345  201aa       ENSMUSP00000151707.1  Protein coding   -          A0A1W2P7I2  TSL:5 GENCODE basic
Epb41l2-207  ENSMUST00000218903.1   4115  918aa       ENSMUSP00000151685.1  Protein coding   -          Q80UE5      TSL:1 GENCODE basic APPRIS ALT2
Epb41l2-204  ENSMUST00000217929.1   3621  794aa       ENSMUSP00000151875.1  Protein coding   -          Q80UE4      TSL:1 GENCODE basic
Epb41l2-218  ENSMUST00000220121.1   3358  806aa       ENSMUSP00000158664.1  Protein coding   -          Q8C928      CDS 5' incomplete TSL:1
Epb41l2-215  ENSMUST00000219900.1   1766  513aa       ENSMUSP00000151332.1  Protein coding   -          A0A1W2P6I5  CDS 3' incomplete TSL:1
Epb41l2-214  ENSMUST00000219805.1   1609  513aa       ENSMUSP00000151233.1  Protein coding   -          A0A1W2P6I5  CDS 3' incomplete TSL:1
Epb41l2-217  ENSMUST00000219967.1   754   144aa       ENSMUSP00000151632.1  Protein coding   -          A0A1W2P7I4  CDS 3' incomplete TSL:3
Epb41l2-209  ENSMUST00000219166.1   623   113aa       ENSMUSP00000151702.1  Protein coding   -          A0A1W2P7H7  CDS 3' incomplete TSL:2
Epb41l2-205  ENSMUST00000217943.1   612   204aa       ENSMUSP00000152003.1  Protein coding   -          A0A1W2P8C0  CDS 5' and 3' incomplete TSL:1
Epb41l2-211  ENSMUST00000219224.1   563   95aa        ENSMUSP00000151926.1  Protein coding   -          A0A1W2P896  CDS 3' incomplete TSL:2
Epb41l2-212  ENSMUST00000219372.1   523   131aa       ENSMUSP00000151258.1  Protein coding   -          A0A1W2P6H2  CDS 5' incomplete TSL:2
Epb41l2-220  ENSMUST00000220335.1   2402  No protein  -                     Retained intron  -          -           TSL:NA
Epb41l2-210  ENSMUST00000219201.1   1126  No protein  -                     Retained intron  -          -           TSL:2
Epb41l2-216  ENSMUST00000219941.1   803   No protein  -                     Retained intron  -          -           TSL:3
Epb41l2-206  ENSMUST00000218345.1   722   No protein  -                     Retained intron  -          -           TSL:3
Epb41l2-213  ENSMUST00000219390.1   596   No protein  -                     Retained intron  -          -           TSL:2
Epb41l2-208  ENSMUST00000219138.1   1489  No protein  -                     lncRNA           -          -           TSL:1
Epb41l2-203  ENSMUST00000217844.1   518   No protein  -                     lncRNA           -          -           TSL:1

[Figure: Ensembl genome browser view of the Epb41l2 locus, 183.72 kb, chromosome 10 from about 25.35 Mb to 25.50 Mb]
Forward strand transcripts: Epb41l2-201, -202, -204, -205, -207, -209, -211, -212, -214, -215, -217, -218, -219 (protein coding); Epb41l2-206, -210, -213, -216, -220 (retained intron); Epb41l2-203, -208 (lncRNA)
Contigs: AC153550.10, AC156274.8, AC153547.3
Reverse strand genes: Gm47830-201 (processed pseudogene), Smlr1-201 (protein coding), Smlr1-202 (retained intron)
Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank, Transcription Factor Binding Site
Gene legend: Protein coding (Ensembl protein coding; merged Ensembl/Havana); Non-protein coding (RNA gene; processed transcript; pseudogene)


Transcript: ENSMUST00000053748

[Figure: Ensembl protein summary for Epb41l2-201 (protein coding; 163.72 kb, forward strand), with annotations drawn along the protein]
MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils)
Superfamily: FERM superfamily, second domain; Ubiquitin-like domain superfamily; SSF50729
SMART: Band 4.1 domain; FERM, C-terminal PH-like domain; FERM adjacent (FA)
Prints: Band 4.1 domain; Ezrin/radixin/moesin-like
Pfam: FERM, N-terminal; FERM, C-terminal PH-like domain; Band 4.1, C-terminal; FERM central domain; FERM adjacent (FA); SAB domain
PROSITE profiles: FERM domain
PROSITE patterns: FERM conserved site
PIRSF: PIRSF002304
PANTHER: Band 4.1-like protein 2; PTHR23280
Gene3D: FERM/acyl-CoA-binding protein superfamily; 3.10.20.90; PH-like domain superfamily
CDD: cd17202; cd13184; FERM central domain
Sequence variants (dbSNP and all other sources): missense variant; synonymous variant
Scale bar: 0 to 988 (amino acid positions)

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
