https://www.alphaknockout.com

Mouse Appl2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create an Appl2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Appl2 gene (NCBI Reference Sequence: NM_145220; Ensembl: ENSMUSG00000020263) is located on mouse chromosome 10. Twenty-one exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 21 (Transcript: ENSMUST00000020500). Exons 6~7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Appl2 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-230G15 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a null allele display altered red blood cell physiology. Mutant MEFs exhibit defects in HGF-induced Akt activation, migration, and invasion.

Exon 6 starts at about 18.83% of the coding region. Knocking out exons 6~7 will cause a frameshift of the gene. Intron 5, which receives the 5'-loxP site, is 2865 bp; intron 7, which receives the 3'-loxP site, is 684 bp. The effective cKO region is ~725 bp and contains no other known gene.
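As a rough check of these figures, the sketch below converts the stated 18.83% offset into an approximate nucleotide and codon position. It assumes a coding sequence of 662 codons plus one stop codon, taken from the 662 aa protein listed for ENSMUST00000020500. The helper names are hypothetical, and `causes_frameshift` only encodes the general rule that a deletion shifts the reading frame when the removed coding length is not a multiple of 3; the report does not list the individual exon lengths, so none are used here.

```python
from typing import Tuple


def cds_length_nt(protein_aa: int) -> int:
    """Coding-sequence length in nucleotides, counting the stop codon."""
    return (protein_aa + 1) * 3


def approx_cds_position(fraction: float, protein_aa: int) -> Tuple[int, int]:
    """Return (nucleotide offset, 1-based codon index) for a fractional CDS position."""
    nt = round(fraction * cds_length_nt(protein_aa))
    codon = nt // 3 + 1
    return nt, codon


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """True when the deleted coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 0.1883 and 662 come from the report; everything else is derived.
nt, codon = approx_cds_position(0.1883, 662)
print(f"Exon 6 begins near CDS nucleotide {nt} (codon {codon})")
```

The position is approximate because the percentage is rounded in the report and the exact convention used to compute it is not stated.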


Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele, with the 5' and 3' gRNA regions marked and exons 1, 6, 7, 8 and 21 labeled; targeting vector; targeted allele; constitutive KO allele (after Cre recombination). Legend: exon of mouse Appl2, homology arm, cKO region, loxP site.]
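The step from the targeted allele to the constitutive KO allele can be illustrated with a small string-level simulation. This is a sketch under stated assumptions, not the vendor's design tool: it uses the canonical 34 bp loxP sequence, assumes both sites are in the same orientation, and runs on toy flanking sequences that are not real Appl2 sequence. The function name `cre_excise` is hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def cre_excise(seq: str, loxp: str = LOXP) -> str:
    """Simulate Cre recombination between two same-orientation loxP sites.

    The floxed segment and one loxP copy are removed; one loxP site remains.
    Sequences with fewer than two sites are returned unchanged.
    """
    first = seq.find(loxp)
    if first == -1:
        return seq
    second = seq.find(loxp, first + len(loxp))
    if second == -1:
        return seq
    return seq[:first + len(loxp)] + seq[second + len(loxp):]


# Toy stand-ins for intron 5, the cKO region and intron 7 (illustrative only).
upstream, floxed, downstream = "ACGTACGT", "TTTTCCCC", "GATTACA"
targeted = upstream + LOXP + floxed + LOXP + downstream
ko = cre_excise(targeted)
print(f"Removed {len(targeted) - len(ko)} bp; one loxP site remains: {ko.count(LOXP) == 1}")
```

In this model the wildtype allele, which carries no loxP sites, passes through unchanged, which mirrors why the conditional allele behaves as wildtype until Cre is expressed.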


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
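The self-alignment idea can be sketched as follows: every 10 bp word in the sequence is indexed, and any word that occurs at two different positions yields an off-diagonal dot. This is a minimal illustration that assumes exact word matching on the forward strand only; the actual plotting tool, its scoring and its reverse-complement handling are not described in the report. The function name and the toy sequence are hypothetical.

```python
from typing import Dict, List, Tuple


def self_dot_plot(seq: str, window: int = 10) -> List[Tuple[int, int]]:
    """Return off-diagonal (i, j) pairs, i < j, whose window-length words are identical."""
    positions: Dict[str, List[int]] = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    hits: List[Tuple[int, int]] = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy sequence with one 10 bp word repeated after a 4 bp spacer (not Appl2 sequence).
toy = "AAACCCGGGT" + "TTTT" + "AAACCCGGGT"
repeat_hits = self_dot_plot(toy, window=10)
print(repeat_hits)
```

A sequence without repeated words produces an empty list, which corresponds to a dot plot showing only the main diagonal.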

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along Sequence 12.]

Summary: Full Length(7225bp) | A(23.65% 1709) | C(23.88% 1725) | T(26.19% 1892) | G(26.28% 1899)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
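The overall GC content follows directly from the base counts in the summary line, and the windowed profile can be computed with a simple sliding window. The sketch below uses the four counts from the report; the function names, the `step` parameter and the toy sequence are assumptions added for illustration, since the report gives only the 300 bp window size.

```python
from typing import Dict, List

# Base counts copied from the summary line (total 7225 bp).
BASE_COUNTS: Dict[str, int] = {"A": 1709, "C": 1725, "T": 1892, "G": 1899}


def gc_percent_from_counts(counts: Dict[str, int]) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC percentage of each full window, sliding by `step` bases."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


overall = gc_percent_from_counts(BASE_COUNTS)
print(f"Total length: {sum(BASE_COUNTS.values())} bp, GC content: {overall:.2f}%")

# Toy sequence with a small window, for illustration only.
print(windowed_gc("GGCCAATT", window=4, step=2))
```

With the reported counts the overall GC content is close to 50%, consistent with the note that no high-GC region was found.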


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       83622004   83625003   3000
YourSeq  78     1307   1466  3000   86.8%     chr13  +       98668859   98669018   160
YourSeq  77     1274   1466  3000   91.4%     chr4   -       129685450  129686015  566
YourSeq  74     1278   1466  3000   93.1%     chr13  +       48847812   48848086   275
YourSeq  69     1307   1466  3000   88.6%     chr3   -       121321697  121321857  161
YourSeq  68     1303   1446  3000   91.5%     chr3   -       87941424   87941570   147
YourSeq  68     1307   1466  3000   84.9%     chr1   +       163319371  163632544  313174
YourSeq  66     1318   1465  3000   88.4%     chr12  +       91236608   91236757   150
YourSeq  65     1312   1466  3000   81.0%     chr3   -       108954743  108954904  162
YourSeq  64     1303   1468  3000   73.0%     chr13  -       112951594  112951730  137
YourSeq  63     1352   1461  3000   77.2%     chr8   -       108901994  108902101  108
YourSeq  63     2246   2451  3000   71.8%     chr11  +       102812959  102813071  113
YourSeq  61     1299   1450  3000   91.8%     chr14  +       65141526   65141680   155
YourSeq  60     1272   1450  3000   83.4%     chr14  +       13950241   13950625   385
YourSeq  59     1326   1466  3000   85.1%     chr7   -       100456231  100456364  134
YourSeq  59     2254   2352  3000   86.5%     chr7   +       39462669   39463059   391
YourSeq  58     1354   1475  3000   81.6%     chr11  +       21116045   21116162   118
YourSeq  56     1423   2319  3000   96.8%     chr10  -       59050584   59352817   302234
YourSeq  55     1311   1465  3000   88.8%     chr3   -       41217915   41218071   157
YourSeq  55     1306   1454  3000   83.4%     chr2   -       134692253  134692399  147

Note: The 3000 bp section upstream of Exon 6 is BLAT searched against the genome. No significant similarity is found.
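One way to read the BLAT table is to separate the full-length self hit from the remaining hits and look at how much of the 3000 bp query the best remaining hit covers. The sketch below parses four rows copied from the upstream results. The `BlatHit` type, the helper names and the use of score divided by query size as a rough measure are assumptions for illustration; the report does not state what threshold it treats as significant.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row (without the QUERY column)."""
    f = line.split()
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def is_self_hit(hit: BlatHit) -> bool:
    """A hit that matches the whole query at 100% identity."""
    return hit.score == hit.q_size and hit.identity == 100.0


ROWS = [
    "3000 1 3000 3000 100.0% chr10 - 83622004 83625003 3000",
    "78 1307 1466 3000 86.8% chr13 + 98668859 98669018 160",
    "77 1274 1466 3000 91.4% chr4 - 129685450 129686015 566",
    "55 1306 1454 3000 83.4% chr2 - 134692253 134692399 147",
]
hits: List[BlatHit] = [parse_hit(r) for r in ROWS]
off_target = [h for h in hits if not is_self_hit(h)]
best = max(off_target, key=lambda h: h.score)
print(f"Best non-self hit: {best.chrom}, score {best.score}, "
      f"{100.0 * best.score / best.q_size:.1f}% of query size")
```

The best non-self hit scores 78 against a self-hit score of 3000, which is the kind of gap the note summarizes as no significant similarity.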

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr10  -       83618279   83621278   3000
YourSeq  64     2432   2615  3000   94.5%     chr13  -       81813857   81814075   219
YourSeq  63     2430   2493  3000   100.0%    chr1   +       152365339  152365482  144
YourSeq  57     2430   2487  3000   100.0%    chr7   -       41049556   41049643   88
YourSeq  57     2418   2482  3000   93.9%     chr7   -       31186721   31186785   65
YourSeq  56     2430   2487  3000   98.3%     chr6   +       75436705   75436762   58
YourSeq  55     2431   2487  3000   98.3%     chr6   -       124680115  124680171  57
YourSeq  54     2425   2482  3000   91.0%     chr9   -       85873696   85873750   55
YourSeq  54     2430   2487  3000   96.6%     chr15  +       43387620   43387677   58
YourSeq  53     2430   2482  3000   100.0%    chr9   -       43399840   43399892   53
YourSeq  53     2430   2482  3000   100.0%    chr6   -       87229191   87229243   53
YourSeq  53     2455   2607  3000   96.5%     chr6   -       78629370   78629755   386
YourSeq  53     2430   2482  3000   100.0%    chr5   -       125021656  125021708  53
YourSeq  53     2430   2482  3000   100.0%    chr2   -       92431196   92431248   53
YourSeq  53     2430   2482  3000   100.0%    chr2   -       28731511   28731563   53
YourSeq  53     2430   2482  3000   100.0%    chr11  -       52936330   52936382   53
YourSeq  53     2433   2487  3000   98.2%     chr8   +       115989057  115989111  55
YourSeq  53     2430   2487  3000   98.3%     chr19  +       53349426   53349499   74
YourSeq  53     2430   2482  3000   100.0%    chr12  +       16015490   16015542   53
YourSeq  52     2430   2482  3000   100.0%    chr8   -       26710941   26710995   55

Note: The 3000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
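The chr10 coordinates of the two full-length self hits also allow a consistency check on the design: the interval left between the upstream and downstream 3000 bp sections should correspond to the ~725 bp effective cKO region stated in the strategy. The sketch below assumes 1-based inclusive coordinates as printed in the tables; the function name is hypothetical.

```python
from typing import Tuple

# chr10 coordinates of the 100% self hits, copied from the two BLAT tables.
UPSTREAM_SECTION = (83622004, 83625003)    # 3000 bp upstream of Exon 6
DOWNSTREAM_SECTION = (83618279, 83621278)  # 3000 bp downstream of Exon 7


def interval_between(a: Tuple[int, int], b: Tuple[int, int]) -> Tuple[int, int, int]:
    """Return (start, end, length) of the gap between two non-overlapping intervals.

    Coordinates are treated as 1-based and inclusive.
    """
    low, high = sorted([a, b])
    start = low[1] + 1
    end = high[0] - 1
    return start, end, end - start + 1


start, end, length = interval_between(UPSTREAM_SECTION, DOWNSTREAM_SECTION)
print(f"chr10:{start}-{end} ({length} bp)")
```

Because Appl2 lies on the reverse strand, the upstream section has the higher genomic coordinates; sorting the two intervals makes the helper independent of strand.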


Gene and information:

Appl2 adaptor protein, phosphotyrosine interaction, PH domain and leucine zipper containing 2 [Mus musculus (house mouse)]
Gene ID: 216190, updated on 12-Aug-2019

Gene summary

Official Symbol: Appl2 (provided by MGI)
Official Full Name: adaptor protein, phosphotyrosine interaction, PH domain and leucine zipper containing 2 (provided by MGI)
Primary source: MGI:MGI:2384914
See related: Ensembl:ENSMUSG00000020263
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Dip3b; Dip3 beta; DIP13 beta
Expression: Broad expression in cerebellum adult (RPKM 26.5), adrenal adult (RPKM 22.2) and 26 other tissues
Orthologs: human; all

Genomic context

Location: 10; 10 C1

Exon count: 25

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (83600033..83648877, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (83062778..83111409, complement)

[Figure: genomic context of Appl2 on chromosome 10 (NC_000076.6).]


Transcript information: This gene has 14 transcripts

Gene: Appl2 ENSMUSG00000020263

Description: adaptor protein, phosphotyrosine interaction, PH domain and leucine zipper containing 2 [Source:MGI Symbol;Acc:MGI:2384914]
Gene Synonyms: Dip3b
Location: Chromosome 10: 83,600,033-83,648,738 reverse strand. GRCm38:CM001003.2
About this gene: This gene has 14 transcripts (splice variants), 196 orthologues, 24 paralogues, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype                  CCDS       UniProt         Flags
Appl2-201  ENSMUST00000020500.13  3020  662aa       ENSMUSP00000020500.7  Protein coding           CCDS24078  Q3TVI6 Q8K3G9   TSL:1, GENCODE basic, APPRIS P1
Appl2-206  ENSMUST00000146876.8   713   209aa       ENSMUSP00000121336.2  Protein coding           -          D3Z340          CDS 3' incomplete, TSL:5
Appl2-214  ENSMUST00000177187.1   625   67aa        ENSMUSP00000135157.1  Protein coding           -          H3BJX0          CDS 5' incomplete, TSL:5
Appl2-212  ENSMUST00000176294.1   380   95aa        ENSMUSP00000135645.1  Protein coding           -          H3BL43          CDS 3' incomplete, TSL:3
Appl2-211  ENSMUST00000150685.7   774   117aa       ENSMUSP00000115903.1  Nonsense mediated decay  -          D6RFR7          TSL:3
Appl2-213  ENSMUST00000176675.7   773   42aa        ENSMUSP00000135672.1  Nonsense mediated decay  -          H3BL67          CDS 5' incomplete, TSL:5
Appl2-208  ENSMUST00000147582.7   3377  No protein  -                     Retained intron          -          -               TSL:1
Appl2-203  ENSMUST00000130285.8   1210  No protein  -                     Retained intron          -          -               TSL:3
Appl2-210  ENSMUST00000150351.1   807   No protein  -                     Retained intron          -          -               TSL:2
Appl2-209  ENSMUST00000148096.7   743   No protein  -                     Retained intron          -          -               TSL:3
Appl2-202  ENSMUST00000127788.1   614   No protein  -                     Retained intron          -          -               TSL:2
Appl2-205  ENSMUST00000141048.1   427   No protein  -                     Retained intron          -          -               TSL:5
Appl2-204  ENSMUST00000133719.2   961   No protein  -                     lncRNA                   -          -               TSL:5
Appl2-207  ENSMUST00000147118.7   530   No protein  -                     lncRNA                   -          -               TSL:5
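The transcript list shows why ENSMUST00000020500 is a natural reference for the design: it is the only protein-coding entry whose coding sequence is not flagged as incomplete. The sketch below applies that filter to a subset of rows copied from the table. The selection rule is an assumption made for illustration; the report does not state how the reference transcript was chosen.

```python
from typing import List, Tuple

# (name, transcript ID, biotype, flags); a subset of rows copied from the table.
TRANSCRIPTS: List[Tuple[str, str, str, str]] = [
    ("Appl2-201", "ENSMUST00000020500.13", "Protein coding", "TSL:1, GENCODE basic, APPRIS P1"),
    ("Appl2-206", "ENSMUST00000146876.8", "Protein coding", "CDS 3' incomplete, TSL:5"),
    ("Appl2-214", "ENSMUST00000177187.1", "Protein coding", "CDS 5' incomplete, TSL:5"),
    ("Appl2-212", "ENSMUST00000176294.1", "Protein coding", "CDS 3' incomplete, TSL:3"),
    ("Appl2-211", "ENSMUST00000150685.7", "Nonsense mediated decay", "TSL:3"),
    ("Appl2-208", "ENSMUST00000147582.7", "Retained intron", "TSL:1"),
]


def complete_protein_coding(rows: List[Tuple[str, str, str, str]]) -> List[str]:
    """Transcript IDs that are protein coding and not flagged as CDS-incomplete."""
    return [
        transcript_id
        for _, transcript_id, biotype, flags in rows
        if biotype == "Protein coding" and "incomplete" not in flags
    ]


candidates = complete_protein_coding(TRANSCRIPTS)
print(candidates)
```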


[Figure: Ensembl genome browser view of a 68.71 kb region of chromosome 10 (83.60 Mb to 83.65 Mb). Forward strand: Washc4-201 (protein coding), Washc4-202 (nonsense mediated decay), Washc4-204 (retained intron) and contig AC153508.2. Reverse strand: the 14 Appl2 transcripts (Appl2-201 to Appl2-214), followed by the Regulatory Build track. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000020500

[Figure: protein domain view of Appl2-201 (protein coding; reverse strand, 48.71 kb) for ENSMUSP00000020..., with a scale bar from 0 to 662. Tracks shown: MobiDB lite; Superfamily (SSF50729, AH/BAR domain superfamily); SMART (Pleckstrin homology domain, PTB/PI domain); Pfam (PF16746, Pleckstrin homology domain, PTB/PI domain); PROSITE profiles (Pleckstrin homology domain, PTB/PI domain); PANTHER (PTHR46415, PTHR46415:SF1); Gene3D (AH/BAR domain superfamily, PH-like domain superfamily); CDD (cd07632, cd13247, cd13158); sequence variants from dbSNP and all other sources (missense variant, synonymous variant).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
