https://www.alphaknockout.com

Mouse Abra Knockout Project (CRISPR/Cas9)

Objective: To create an Abra knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Abra gene (NCBI Reference Sequence: NM_175456; Ensembl: ENSMUSG00000042895) is located on mouse chromosome 15. Two exons have been identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 2 (Transcript: ENSMUST00000054742). Exons 1 to 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit impaired arteriogenesis following occlusion.

Exon 1 starts at about 0.09% of the coding region. Exons 1 to 2 cover 100.0% of the coding region. The size of the effective KO region is ~3789 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: Wildtype allele, 5' to 3', showing exons 1 and 2 of mouse Abra and two gRNA regions. Legend: exon of mouse Abra; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section upstream of the start codon is aligned with itself to determine whether it contains tandem repeats. Tandem repeats are found in the dot plot matrix, so the gRNA site is selected outside of these tandem repeats.
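The self-alignment described in this note can be sketched as follows. This is a minimal illustration only, assuming exact matching of 15 bp windows (the window size given in the plot header). The function names and the toy sequence are hypothetical, the reverse-complement comparison is omitted for brevity, and the tool actually used to produce the report's dot plots is not stated in the source.

```python
from collections import defaultdict

WINDOW = 15  # window size taken from the dot plot header


def self_dot_plot(seq, window=WINDOW):
    """Return sorted (i, j) pairs, i before j, where the windows starting
    at i and j are identical. Each pair is one off-diagonal dot."""
    seq = seq.upper()
    positions = defaultdict(list)
    # Index every window by its sequence so identical windows group together.
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i). A heavily populated
    offset corresponds to a line parallel to the main diagonal,
    which is the dot plot signature of a repeat."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy input (hypothetical): a 20 bp unit repeated three times in tandem.
unit = "ACGTTGCAAGGCTTACGATC"
toy = unit * 3
offsets = repeat_offsets(self_dot_plot(toy))
print(sorted(offsets.items()))
```

On the toy input, the populated offsets are multiples of the 20 bp unit length. On a real 2000 bp flank, candidate gRNA sites would be checked against the intervals covered by such diagonals.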

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: dot plot of Sequence 12 against itself. Legend: forward; reverse complement.]

Note: The 2000 bp section downstream of the stop codon is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.


Overview of the GC Content Distribution (up)

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(33.25% 665) | C(18.6% 372) | T(30.2% 604) | G(17.95% 359)

Note: The 2000 bp section upstream of the start codon is analyzed to determine its GC content. No significant high-GC region is found, so this region is suitable for PCR screening or sequencing analysis.
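The GC analysis can be illustrated with a short sketch. It assumes a plain sliding window of 300 bp (the window size in the plot header) and uses the base counts from the upstream Summary line. The function names are hypothetical, and the report does not say which tool or step size produced its plots.

```python
WINDOW = 300  # window size taken from the GC content plot header


def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=WINDOW, step=1):
    """GC fraction of each window of the given size along seq."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts for the upstream 2000 bp section, copied from the Summary line.
upstream_counts = {"A": 665, "C": 372, "T": 604, "G": 359}
total = sum(upstream_counts.values())
overall_gc = (upstream_counts["G"] + upstream_counts["C"]) / total
print(f"{total} bp, overall GC = {overall_gc:.2%}")  # 2000 bp, overall GC = 36.55%
```

The same calculation applies to the downstream section using its own Summary counts. What counts as a "significant high-GC region" is not defined in the report, so any cutoff applied to the output of `sliding_gc` would be an assumption.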

Overview of the GC Content Distribution (down)

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(2000bp) | A(33.3% 666) | C(18.15% 363) | T(31.6% 632) | G(16.95% 339)

Note: The 2000 bp section downstream of the stop codon is analyzed to determine its GC content. No significant high-GC region is found, so this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY     SCORE  START  END    QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000   1      2000   2000   100.0%    chr15  -       41869669   41871668   2000
YourSeq   60     973    1070   2000   78.4%     chr5   -       132015169  132015265  97
YourSeq   55     973    1063   2000   80.3%     chr7   +       97442250   97442340   91
YourSeq   54     973    1072   2000   77.0%     chr19  -       46155013   46155112   100
YourSeq   54     975    1068   2000   83.8%     chr1   -       106195574  106195666  93
YourSeq   54     985    1073   2000   89.8%     chr12  +       78774600   78774689   90
YourSeq   54     973    1069   2000   81.1%     chr11  +       101653829  101653922  94
YourSeq   53     973    1061   2000   79.8%     chr16  +       72701390   72701478   89
YourSeq   53     973    1310   2000   59.7%     chr13  +       95458104   95458243   140
YourSeq   53     981    1068   2000   86.4%     chr10  +       41359925   41360012   88
YourSeq   51     986    1068   2000   80.8%     chr13  +       43509964   43510046   83
YourSeq   48     973    1072   2000   74.0%     chr16  -       22712338   22712437   100
YourSeq   48     976    1069   2000   79.0%     chr11  +       77187465   77187557   93
YourSeq   47     814    878    2000   90.6%     chr9   -       79129251   79129314   64
YourSeq   47     964    1028   2000   87.4%     chr17  -       29954893   29954958   66
YourSeq   46     976    1072   2000   78.8%     chr17  +       49764874   49764971   98
YourSeq   45     985    1072   2000   77.2%     chr10  -       63345957   63346033   77
YourSeq   44     1014   1086   2000   82.8%     chr6   +       51455481   51455551   71
YourSeq   44     973    1068   2000   73.0%     chr18  +       55480441   55480536   96
YourSeq   44     973    1028   2000   89.3%     chr1   +       74356561   74356616   56

Note: The 2000 bp section upstream of the start codon is BLAT searched against the genome. Apart from the query's own locus on chr15, no significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START  END    QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   2000   1      2000   2000   100.0%    chr15  -       41863878   41865877   2000
YourSeq   125    1164   1304   2000   92.8%     chr13  +       34985549   34985687   139
YourSeq   124    1165   1304   2000   95.7%     chr6   +       6438587    6438754    168
YourSeq   124    1164   1306   2000   94.5%     chr15  +       94643847   94644006   160
YourSeq   121    1166   1304   2000   91.5%     chr19  -       5908951    5909081    131
YourSeq   121    1167   1303   2000   94.9%     chrX   +       114075650  114075795  146
YourSeq   121    1159   1304   2000   88.7%     chr13  +       91438992   91439133   142
YourSeq   120    1175   1304   2000   96.2%     chr11  -       46708843   46708972   130
YourSeq   120    1168   1301   2000   95.5%     chr6   +       51514782   51514920   139
YourSeq   119    1168   1304   2000   90.3%     chr16  -       22174101   22174233   133
YourSeq   118    1166   1304   2000   93.4%     chr19  -       4076890    4077032    143
YourSeq   116    1168   1304   2000   92.7%     chr2   +       32192140   32192277   138
YourSeq   115    1164   1304   2000   93.6%     chr2   +       132031706  132031844  139
YourSeq   114    1176   1301   2000   96.1%     chr16  -       33037614   33037743   130
YourSeq   114    1183   1304   2000   96.8%     chr13  -       98232213   98232334   122
YourSeq   114    1171   1306   2000   93.8%     chr19  +       40930556   40930697   142
YourSeq   113    1168   1304   2000   92.5%     chr13  -       36199737   36200153   417
YourSeq   113    1164   1306   2000   88.4%     chr1   +       184230049  184230181  133
YourSeq   112    1176   1304   2000   93.8%     chr2   +       117198333  117198462  130
YourSeq   111    1168   1304   2000   91.1%     chr4   -       12528998   12529129   132

Note: The 2000 bp section downstream of the stop codon is BLAT searched against the genome. Apart from the query's own locus on chr15, no significant similarity is found.
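For readers who want to screen such BLAT result rows programmatically, a rough sketch follows. It assumes whitespace-separated rows in the column order shown in the tables above. The `BlatHit` type, the function names and the `min_score` cutoff are hypothetical and are not criteria stated in this report.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line):
    """Parse one result row; a leading 'browser details' pair is ignored."""
    fields = line.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    q, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(q, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits, min_score=100):
    """Hits other than the full-length 100% self-match whose score is at
    least min_score (an assumed cutoff, chosen only for illustration)."""
    def is_self(h):
        return h.q_start == 1 and h.q_end == h.q_size and h.identity == 100.0
    return [h for h in hits if h.score >= min_score and not is_self(h)]


# Three rows copied from the downstream table.
rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr15 - 41863878 41865877 2000",
    "YourSeq 125 1164 1304 2000 92.8% chr13 + 34985549 34985687 139",
    "YourSeq 111 1168 1304 2000 91.1% chr4 - 12528998 12529129 132",
]
hits = [parse_hit(r) for r in rows]
for h in secondary_hits(hits):
    print(h.chrom, h.score, h.q_start, h.q_end, f"{h.identity}%")
```

Listing the query intervals of the remaining hits makes it easy to see which part of the 2000 bp flank they share, which is useful when placing PCR primers.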


Gene and information:

Abra actin-binding Rho activating protein [Mus musculus (house mouse)]
Gene ID: 223513, updated on 8-Oct-2019

Gene summary

Official Symbol: Abra (provided by MGI)
Official Full Name: actin-binding Rho activating protein (provided by MGI)
Primary source: MGI:MGI:2444891
See related: Ensembl:ENSMUSG00000042895
Gene type: protein coding
RefSeq status: PROVISIONAL
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: STARS; C130068O12Rik
Expression: Biased expression in heart adult (RPKM 5.2), mammary gland adult (RPKM 1.9) and 3 other tissues

Genomic context

Location: 15; 15 B3.1
Exon count: 2

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  15   NC_000081.6 (41865293..41869720, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    15   NC_000081.5 (41696839..41701266, complement)

Chromosome 15 - NC_000081.6


Transcript information: This gene has 1 transcript

Gene: Abra ENSMUSG00000042895

Description: actin-binding Rho activating protein [Source:MGI Symbol;Acc:MGI:2444891]
Gene Synonyms: C130068O12Rik, STARS
Location: Chromosome 15: 41,864,076-41,869,720 reverse strand. GRCm38:CM001008.2
About this gene: This gene has 1 transcript (splice variant), 205 orthologues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Abra-201  ENSMUST00000054742.6  2979  375aa    ENSMUSP00000051973.5  Protein coding  CCDS27449  Q8BUZ1   TSL:1, GENCODE basic, APPRIS P1

[Figure: Ensembl genome browser view of a 25.64 kb region (41.855 Mb to 41.875 Mb). Forward strand: Oxr1 transcripts Oxr1-201 to Oxr1-207, Oxr1-213 and Oxr1-214 (protein coding) and Oxr1-211 (retained intron). Contig: AC129212.4. Reverse strand: Abra-201 (protein coding). Additional track: Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (processed transcript).]


Transcript: ENSMUST00000054742

[Figure: Abra-201 (protein coding), reverse strand, 5.64 kb. Protein features for ENSMUSP00000051...: MobiDB lite; SMART Costars domain; Pfam Costars domain; PANTHER PTHR22739 (Actin-binding Rho-activating protein); Gene3D Costars domain superfamily. Sequence variants (dbSNP and all other sources), with legend: missense variant; synonymous variant. Scale bar: 0 to 375.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
