https://www.alphaknockout.com

Mouse Baiap2 Knockout Project (CRISPR/Cas9)

Objective: To create a Baiap2 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Baiap2 gene (NCBI Reference Sequence: NM_130862; Ensembl: ENSMUSG00000025372) is located on mouse chromosome 11. Fourteen exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000075180). Exons 2-3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Homozygotes for a knock-out allele show mid to late gestation lethality, developmental delay, oligodactyly, subcutaneous edema, and severely impaired cardiac and placental development. Adult homozygotes fail to regulate synaptic plasticity and exhibit hippocampus-associated learning deficits.

Exon 2 starts at about 3.51% of the coding region. Exons 2-3 cover 10.41% of the coding region. The effective KO region is ~3142 bp in size and contains no other known gene.
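The percentages above can be converted into approximate nucleotide counts. The following is a minimal sketch, assuming the percentages are taken relative to a coding sequence of 522 codons (the protein length listed for ENSMUST00000075180 in the transcript table), with the stop codon excluded. That denominator is an assumption, not something the report states, and the function name is hypothetical.

```python
# Assumption: coding length = 522 codons x 3 nt, stop codon excluded.
CDS_LENGTH_NT = 522 * 3


def percent_to_nt(percent, cds_length=CDS_LENGTH_NT):
    """Convert a percentage of the coding region into a rounded nucleotide count."""
    return round(percent / 100 * cds_length)


exon2_offset_nt = percent_to_nt(3.51)    # coding nt upstream of exon 2
exon2_3_coding_nt = percent_to_nt(10.41)  # coding nt carried by exons 2-3

print("coding nt before exon 2:", exon2_offset_nt)
print("coding nt in exons 2-3:", exon2_3_coding_nt)
# A remainder other than 0 suggests that deleting the exons shifts the reading frame.
print("remainder modulo 3:", exon2_3_coding_nt % 3)
```

Under these assumptions the deleted coding segment is not a multiple of three, which would be consistent with a frameshift design. This should be confirmed against the actual exon coordinates.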


Overview of the Targeting Strategy

[Figure: wildtype allele of mouse Baiap2, drawn 5' to 3', showing exons 1, 2, 3 and 14 and two marked gRNA regions. Legend: exon of mouse Baiap2; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self dot plot of Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self dot plot of Sequence 12. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.
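The self-alignment described in the dot plot notes can be approximated with an exact-match word comparison. The sketch below is illustrative only: the report does not name the dot plot tool or its scoring, so exact 15-bp word matching, the function names, and the toy sequence are all assumptions.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def self_dot_plot(seq, window=15):
    """Compare a sequence with itself using exact words of length `window`.

    Returns two lists of (i, j) window start positions:
    forward - the word at i also occurs at j, with j > i (off-diagonal dots)
    reverse - the reverse complement of the word at i occurs at j
    """
    seq = seq.upper()
    n_windows = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(reverse_complement(word), []))
    return forward, reverse


# Hypothetical toy input: a 20-nt unit repeated three times.
toy = "ACGGTCATTGCAAGTCCGTA" * 3
fwd, rev = self_dot_plot(toy, window=15)
print("forward off-diagonal matches:", len(fwd))
print("reverse-complement matches:", len(rev))
```

Runs of forward matches parallel to the main diagonal correspond to tandem repeats. A flanking region with no such runs is the easier target for PCR primers and sequencing.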


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(17.9% 358) | C(26.5% 530) | T(29.35% 587) | G(26.25% 525)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(23.25% 465) | C(23.55% 471) | T(29.1% 582) | G(24.1% 482)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
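The overall GC content of each flanking region follows directly from the base counts in the two summaries. The sketch below computes it and also shows one way a 300 bp sliding-window profile could be produced. The function names and the windowing details (step size, no smoothing) are assumptions, since the report does not describe how its plots were generated.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage from the four base counts."""
    return 100 * (g + c) / (a + c + t + g)


def sliding_gc(seq, window=300, step=1):
    """Return (start, GC%) for each full window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100 * gc / window))
    return profile


# Base counts copied from the two summary lines above.
upstream = gc_percent_from_counts(a=358, c=530, t=587, g=525)
downstream = gc_percent_from_counts(a=465, c=471, t=582, g=482)
print(f"upstream GC:   {upstream:.2f}%")    # 52.75%
print(f"downstream GC: {downstream:.2f}%")  # 47.65%
```

Both values sit near 50%, in line with the notes that no significant high GC-content region is present.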


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  +       119955103  119957102  2000
YourSeq  23     647    669   2000   100.0%    chr2   +       59817904   59817926   23
YourSeq  22     816    837   2000   100.0%    chrY   +       10680671   10680692   22
YourSeq  21     756    776   2000   100.0%    chr1   +       110002634  110002654  21
YourSeq  20     1980   1999  2000   100.0%    chr1   +       21485421   21485440   20

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr11  +       119960245  119962244  2000
YourSeq  79     878    994   2000   85.1%     chr10  -       60499446   60499561   116
YourSeq  78     883    1008  2000   83.7%     chr15  -       81545728   81545846   119
YourSeq  76     883    993   2000   82.5%     chr16  -       94263865   94263973   109
YourSeq  76     880    1009  2000   88.8%     chr14  +       50901964   50902094   131
YourSeq  74     883    993   2000   84.9%     chr10  -       20218429   20218538   110
YourSeq  73     899    993   2000   89.5%     chr11  -       50117356   50357172   239817
YourSeq  72     883    993   2000   85.3%     chr15  -       55026522   55026631   110
YourSeq  72     880    993   2000   81.6%     chr1   +       21301175   21301288   114
YourSeq  71     883    993   2000   83.2%     chr11  -       114898540  114898653  114
YourSeq  71     879    1008  2000   85.9%     chr11  -       94948123   94948247   125
YourSeq  71     883    993   2000   82.0%     chr16  +       34060170   34060280   111
YourSeq  71     883    1013  2000   83.4%     chr13  +       93317092   93317221   130
YourSeq  71     883    1011  2000   81.0%     chr11  +       46187357   46187483   127
YourSeq  69     883    1009  2000   74.4%     chr10  -       117820240  117820360  121
YourSeq  69     880    1008  2000   83.2%     chr15  +       8540595    8540718    124
YourSeq  68     883    1007  2000   82.3%     chr18  -       60758205   60758327   123
YourSeq  68     883    1010  2000   89.6%     chr13  -       62768133   62768261   129
YourSeq  68     883    1008  2000   83.0%     chr10  +       108471451  108471574  124
YourSeq  67     885    1008  2000   85.3%     chr1   +       87297359   87297481   123

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
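The top hit in each BLAT table gives the genomic position of a 2000 bp flanking section, so the interval between the two flanks can be checked against the stated effective KO region size. The sketch below assumes the coordinates are 1-based and inclusive, as displayed, and that the KO region is exactly the gap between the flanks. The constant and function names are illustrative.

```python
# Top (100% identity) chr11 hits copied from the two BLAT tables above.
UPSTREAM_FLANK = (119955103, 119957102)    # 2000 bp upstream of exon 2
DOWNSTREAM_FLANK = (119960245, 119962244)  # 2000 bp downstream of exon 3


def span(interval):
    """Length of a 1-based, inclusive interval."""
    start, end = interval
    return end - start + 1


def gap_between(upstream, downstream):
    """Number of bases strictly between two non-overlapping intervals."""
    return downstream[0] - upstream[1] - 1


print("upstream flank length:  ", span(UPSTREAM_FLANK))    # 2000
print("downstream flank length:", span(DOWNSTREAM_FLANK))  # 2000
print("gap between flanks:     ", gap_between(UPSTREAM_FLANK, DOWNSTREAM_FLANK))  # 3142
```

The gap of 3142 bp agrees with the effective KO region size quoted in the strategy summary.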


Gene and information:

Baiap2, brain-specific angiogenesis inhibitor 1-associated protein 2 [Mus musculus (house mouse)]
Gene ID: 108100, updated on 12-Aug-2019

Gene summary

Official Symbol: Baiap2 (provided by MGI)
Official Full Name: brain-specific angiogenesis inhibitor 1-associated protein 2 (provided by MGI)
Primary source: MGI:MGI:2137336
See related: Ensembl:ENSMUSG00000025372
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: IRSp53; R75030
Expression: Ubiquitous expression in frontal lobe adult (RPKM 31.0), cortex adult (RPKM 30.6) and 27 other tissues
Orthologs: human; all

Genomic context

Location: 11; 11 E2
Exon count: 16

Annotation release  Status             Assembly                     Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (119939916..120006782)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (119804406..119868096)

Chromosome 11 - NC_000077.6


Transcript information: This gene has 9 transcripts

Gene: Baiap2 ENSMUSG00000025372

Description: brain-specific angiogenesis inhibitor 1-associated protein 2 [Source:MGI Symbol;Acc:MGI:2137336]
Gene Synonyms: IRSp53
Location: Chromosome 11: 119,942,763-120,006,782 forward strand. GRCm38:CM001004.2
About this gene: This gene has 9 transcripts (splice variants), 247 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 11 phenotypes.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt         Flags
Baiap2-202  ENSMUST00000075180.11  3518  522aa       ENSMUSP00000074674.5  Protein coding  CCDS25724  Q3UKP6, Q8BKX1  TSL:1, GENCODE basic, APPRIS ALT1
Baiap2-203  ENSMUST00000103021.9   3398  482aa       ENSMUSP00000099310.3  Protein coding  CCDS25723  Q8BKX1          TSL:1, GENCODE basic
Baiap2-201  ENSMUST00000026436.9   2394  535aa       ENSMUSP00000026436.3  Protein coding  CCDS25722  Q8BKX1          TSL:1, GENCODE basic, APPRIS P4
Baiap2-204  ENSMUST00000106231.7   4005  513aa       ENSMUSP00000101838.1  Protein coding  -          Q8BKX1          TSL:1, GENCODE basic
Baiap2-205  ENSMUST00000106233.1   2442  521aa       ENSMUSP00000101840.1  Protein coding  -          B1AZ46          TSL:5, GENCODE basic, APPRIS ALT1
Baiap2-209  ENSMUST00000152523.7   585   No protein  -                     lncRNA          -          -               TSL:2
Baiap2-207  ENSMUST00000146566.1   433   No protein  -                     lncRNA          -          -               TSL:3
Baiap2-206  ENSMUST00000131580.1   360   No protein  -                     lncRNA          -          -               TSL:5
Baiap2-208  ENSMUST00000146960.1   276   No protein  -                     lncRNA          -          -               TSL:5


[Figure: Ensembl genome browser view of the Baiap2 locus, 84.02 kb, chromosome 11 from 119.94 Mb to 120.00 Mb (comprehensive gene set, with Regulatory Build track).
Forward strand: Baiap2-201 to Baiap2-205 (protein coding); Baiap2-206 to Baiap2-209 (lncRNA); Gm11766-201 (lncRNA); Mir3065-201 (miRNA).
Contigs: AL929234.9, AL953913.5.
Reverse strand: Aatk-201 to Aatk-203 (protein coding); Aatk-204 to Aatk-209 (lncRNA); Gm11767-201 (lncRNA); Mir338-201 (miRNA).
Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank; transcription factor binding site.
Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene).]


Transcript: ENSMUST00000075180

[Figure: Baiap2-202 (protein coding), 59.86 kb, forward strand, with protein domain annotations for ENSMUSP00000074... on a scale bar from 0 to 522 aa.]

Domain annotations shown in the figure:
MobiDB lite; Low complexity (Seg)
Superfamily: AH/BAR domain superfamily; SH3-like domain superfamily
SMART: SH3 domain
Pfam: IMD/I-BAR domain; SH3 domain
PROSITE profiles: IMD/I-BAR domain; SH3 domain
PANTHER: I-BAR domain containing protein IRSp53; I-BAR domain containing protein IRSp53/IRTKS/Pinkbar
Gene3D: AH/BAR domain superfamily; 2.30.30.40
CDD: cd07646; IRSp53, SH3 domain
Sequence variants (dbSNP and all other sources). Variant legend: synonymous variant

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
