https://www.alphaknockout.com

Mouse Bcas1 Knockout Project (CRISPR/Cas9)

Objective: To create a Bcas1 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Bcas1 gene (NCBI Reference Sequence: NM_029815; Ensembl: ENSMUSG00000013523) is located on mouse chromosome 2. Thirteen exons have been identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 13 (Transcript: ENSMUST00000013667). Exon 4 is selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit hypomyelination, decreased prepulse inhibition suggestive of schizophrenia-like symptoms, a tendency toward reduced anxiety-like behaviors, and upregulation of inflammation-related genes in the brain.

Exon 4 starts at about 8.0% of the coding region and covers 30.28% of it. The size of the effective KO region is ~575 bp. The KO region does not contain any other known gene.
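The coverage figure can be sanity-checked with a little arithmetic. The sketch below assumes the coding region is 633 codons long (the protein length listed for ENSMUST00000013667, stop codon excluded) and treats the approximate 575 bp KO size as the coding length of exon 4. Both are assumptions made for illustration, and the function names are hypothetical.

```python
# Assumed inputs (not confirmed by the report):
#   633 aa  = protein length listed for ENSMUST00000013667, stop codon excluded
#   575 bp  = approximate KO region size, treated here as exon 4's coding length
PROTEIN_LENGTH_AA = 633
KO_REGION_BP = 575


def coding_fraction_percent(region_bp, protein_aa):
    """Percentage of the coding region (3 bp per codon) covered by region_bp."""
    cds_bp = protein_aa * 3
    return round(100.0 * region_bp / cds_bp, 2)


def shifts_reading_frame(region_bp):
    """A deletion whose length is not a multiple of 3 shifts the downstream frame."""
    return region_bp % 3 != 0


coverage = coding_fraction_percent(KO_REGION_BP, PROTEIN_LENGTH_AA)
frameshift = shifts_reading_frame(KO_REGION_BP)
print(coverage, frameshift)
```

Under these assumptions the ratio agrees with the stated 30.28%. Because the 575 bp size is approximate, the frame check is illustrative only.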


Overview of the Targeting Strategy

Figure: Wildtype allele, drawn 5' to 3', showing exons 1 to 13 with gRNA regions on either side of exon 4. Legend: exon of mouse Bcas1; knockout region.


Overview of the Dot Plot (up) Window size: 15 bp

Figure: Dot plot of the upstream sequence against itself. Legend: Forward; Reverse Complement.

Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
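The self-alignment idea behind a dot plot can be sketched in a few lines. This is a minimal illustration that assumes exact matching of fixed-length words on the forward strand only; the tool and scoring used for the report are not stated, and `dot_plot_matches` is a hypothetical name.

```python
def dot_plot_matches(seq, window=15):
    """Return sorted (i, j) pairs, i < j, where the window-length words
    starting at positions i and j of seq are identical.

    Points on the main diagonal (i == j) are omitted because every
    sequence trivially matches itself there.
    """
    seq = seq.upper()
    positions = {}
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)

    matches = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                matches.append((starts[a], starts[b]))
    return sorted(matches)


# Toy example with a short window: "ACGT" repeats with a period of 4.
print(dot_plot_matches("ACGTACGTAC", window=4))
```

Runs of off-diagonal points that line up parallel to the main diagonal indicate a repeated segment; a tandem repeat shows up as such runs close to the diagonal.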

Overview of the Dot Plot (down) Window size: 15 bp

Figure: Dot plot of the downstream sequence against itself. Legend: Forward; Reverse Complement.

Note: The 2000 bp section downstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up) Window size: 300 bp


Summary: Full Length(2000bp) | A(25.3% 506) | C(23.25% 465) | T(25.75% 515) | G(25.7% 514)

Note: The 2000 bp section upstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
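As a rough check, the overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed with a simple sliding window. The sketch below is illustrative: the function names are hypothetical, the 300 bp default mirrors the stated window size, and the step size is an assumption.

```python
def gc_percent_from_counts(a, c, t, g):
    """Overall GC percentage from per-base counts."""
    total = a + c + t + g
    return round(100.0 * (g + c) / total, 2)


def sliding_gc(seq, window=300, step=1):
    """GC percentage of each window of the given size along seq."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts taken from the upstream summary line (A 506, C 465, T 515, G 514).
upstream_gc = gc_percent_from_counts(a=506, c=465, t=515, g=514)
print(upstream_gc)

# Toy sequence to show the windowed profile.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

The counts give (465 + 514) / 2000, just under 49% GC overall for the upstream section.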

Overview of the GC Content Distribution (down) Window size: 300 bp


Summary: Full Length(2000bp) | A(28.0% 560) | C(22.4% 448) | T(25.15% 503) | G(24.45% 489)

Note: The 2000 bp section downstream of Exon 4 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr2   -       170406849  170408848  2000
YourSeq  68     1336   1422  2000   91.4%     chr11  +       104163159  104163371  213
YourSeq  68     1335   1418  2000   91.7%     chr10  +       128674494  128937562  263069
YourSeq  67     1287   1430  2000   89.9%     chr1   -       156435586  156435760  175
YourSeq  64     1361   1430  2000   97.2%     chr11  -       121361086  121361356  271
YourSeq  62     1361   1439  2000   87.1%     chrX   -       94917267   94917344   78
YourSeq  61     1272   1444  2000   83.6%     chr11  -       83524222   83524393   172
YourSeq  60     1336   1421  2000   84.9%     chr10  -       77043649   77043734   86
YourSeq  58     1287   1421  2000   91.5%     chr16  -       21076132   21076554   423
YourSeq  58     1274   1422  2000   95.4%     chr11  -       119265306  119265632  327
YourSeq  57     1335   1420  2000   83.2%     chrX   -       42084959   42085042   84
YourSeq  57     1361   1434  2000   84.8%     chr2   -       167401040  167401111  72
YourSeq  56     1361   1433  2000   87.0%     chr10  -       24673856   24673927   72
YourSeq  56     1332   1422  2000   73.5%     chr11  +       76071005   76071087   83
YourSeq  54     1354   1425  2000   88.6%     chr12  +       80996663   80996740   78
YourSeq  53     1359   1422  2000   88.6%     chr11  -       32062526   32062587   62
YourSeq  53     1361   1431  2000   88.5%     chr1   -       127670073  127670143  71
YourSeq  53     1354   1417  2000   92.2%     chr11  +       115722266  115722337  72
YourSeq  53     1354   1430  2000   88.5%     chr1   +       57405213   57405296   84
YourSeq  52     1354   1430  2000   77.3%     chr12  -       83590458   83590526   69

Note: The 2000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
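One way to review such a result table programmatically is to parse each row and list every hit other than the full-length self match. The sketch below is an illustration only: `BlatHit`, `parse_blat_row` and `secondary_hits` are hypothetical names, the score cutoffs are arbitrary assumptions rather than criteria stated in this report, and only the first three upstream rows are included as sample input.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row):
    """Parse one whitespace-separated result row into a BlatHit."""
    fields = row.split()
    # Tolerate the "browser details" link text that some exports prepend.
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(query, int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def secondary_hits(hits, min_score):
    """Hits scoring at least min_score, excluding the full-length self match."""
    return [
        h for h in hits
        if h.score >= min_score
        and not (h.score == h.q_size and h.identity == 100.0)
    ]


rows = [
    "YourSeq 2000 1 2000 2000 100.0% chr2 - 170406849 170408848 2000",
    "YourSeq 68 1336 1422 2000 91.4% chr11 + 104163159 104163371 213",
    "YourSeq 68 1335 1418 2000 91.7% chr10 + 128674494 128937562 263069",
]
hits = [parse_blat_row(r) for r in rows]

# min_score values are arbitrary, illustrative cutoffs.
print(len(secondary_hits(hits, min_score=100)))
print([h.chrom for h in secondary_hits(hits, min_score=60)])
```

In the sample rows, the secondary hits each align fewer than 100 of the 2000 query bases, which is in line with the note that no significant similarity is found.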

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  2000   1      2000  2000   100.0%    chr2   -       170404274  170406273  2000
YourSeq  72     406    552   2000   84.8%     chr5   -       129998672  130013898  15227
YourSeq  68     288    579   2000   92.5%     chr7   -       118240099  118240518  420
YourSeq  55     472    545   2000   87.9%     chr7   +       121789584  121789666  83
YourSeq  54     396    543   2000   93.6%     chr1   +       179938801  179938951  151
YourSeq  50     411    544   2000   94.8%     chr7   -       116135318  116135459  142
YourSeq  50     406    547   2000   89.1%     chr1   -       128682425  128682568  144
YourSeq  49     208    555   2000   75.0%     chr6   +       120588126  120588458  333
YourSeq  45     470    542   2000   94.3%     chr1   -       91215701   91215780   80
YourSeq  44     389    529   2000   92.4%     chr2   -       119874086  119874228  143
YourSeq  42     470    529   2000   85.0%     chr4   -       108916404  108916463  60
YourSeq  42     407    531   2000   91.4%     chr11  -       118074613  118074736  124
YourSeq  42     511    582   2000   85.8%     chr13  +       18460931   18461000   70
YourSeq  41     389    531   2000   87.3%     chr10  +       122594115  122594258  144
YourSeq  38     512    552   2000   97.6%     chr4   -       53993367   53993408   42
YourSeq  38     509    553   2000   95.3%     chr7   +       98577279   98577330   52
YourSeq  38     1070   1107  2000   100.0%    chr11  +       65759334   65759371   38
YourSeq  36     406    552   2000   97.4%     chrX   +       11544176   11544323   148
YourSeq  35     473    531   2000   97.3%     chr4   -       143011584  143011646  63
YourSeq  35     466    520   2000   81.9%     chr14  -       71022353   71022407   55

Note: The 2000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.


Gene and information: Bcas1 breast carcinoma amplified sequence 1 [ Mus musculus (house mouse) ] Gene ID: 76960, updated on 17-Sep-2019

Gene summary

Official Symbol: Bcas1 (provided by MGI)
Official Full Name: breast carcinoma amplified sequence 1 (provided by MGI)
Primary source: MGI:MGI:1924210
See related: Ensembl:ENSMUSG00000013523
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: NABC1; AI841227; 2210416M21Rik; 9030223A09Rik
Expression: Biased expression in cerebellum adult (RPKM 42.9), cortex adult (RPKM 25.7) and 6 other tissues
Orthologs: human, all

Genomic context

Location: 2; 2 H3
Exon count: 14

Annotation release | Status | Assembly | Chr | Location
108 | current | GRCm38.p6 (GCF_000001635.26) | 2 | NC_000068.7 (170343973..170427911, complement)
Build 37.2 | previous assembly | MGSCv37 (GCF_000001635.18) | 2 | NC_000068.6 (170172491..170253345, complement)

Chromosome 2 - NC_000068.7


Transcript information: This gene has 9 transcripts

Gene: Bcas1 ENSMUSG00000013523

Description: breast carcinoma amplified sequence 1 [Source:MGI Symbol;Acc:MGI:1924210]
Gene Synonyms: 2210416M21Rik, 9030223A09Rik, NABC1
Location: Chromosome 2: 170,346,991-170,427,845 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 9 transcripts (splice variants), 129 orthologues, is a member of 1 Ensembl protein family and is associated with 4 phenotypes.

Transcripts

Name | Transcript ID | bp | Protein | Translation ID | Biotype | CCDS | UniProt | Flags
Bcas1-203 | ENSMUST00000109152.8 | 2952 | 587aa | ENSMUSP00000104780.2 | Protein coding | CCDS50806 | E9Q8Q5 | TSL:1 GENCODE basic APPRIS ALT2
Bcas1-201 | ENSMUST00000013667.2 | 2916 | 633aa | ENSMUSP00000013667.2 | Protein coding | CCDS17121 | Q80YN3 | TSL:1 GENCODE basic APPRIS P3
Bcas1-202 | ENSMUST00000068137.10 | 2769 | 577aa | ENSMUSP00000069437.4 | Protein coding | - | A2AVX1 | TSL:5 GENCODE basic APPRIS ALT2
Bcas1-208 | ENSMUST00000154650.7 | 2222 | 379aa | ENSMUSP00000122298.1 | Protein coding | - | F7BNZ5 | CDS 5' incomplete TSL:1
Bcas1-205 | ENSMUST00000145920.1 | 1014 | No protein | - | lncRNA | - | - | TSL:1
Bcas1-204 | ENSMUST00000133673.7 | 688 | No protein | - | lncRNA | - | - | TSL:1
Bcas1-209 | ENSMUST00000156657.7 | 594 | No protein | - | lncRNA | - | - | TSL:3
Bcas1-206 | ENSMUST00000147577.1 | 444 | No protein | - | lncRNA | - | - | TSL:2
Bcas1-207 | ENSMUST00000152461.1 | 371 | No protein | - | lncRNA | - | - | TSL:3


Figure: Genome browser view of the Bcas1 locus (100.86 kb; scale 170.34Mb to 170.42Mb). Forward strand: Bcas1os2-201 and Bcas1os1-201 (lncRNA); contigs AL928812.11 and AL935134.10. Reverse strand: Bcas1-203, Bcas1-208, Bcas1-202 and Bcas1-201 (protein coding); Bcas1-209, Bcas1-206, Bcas1-205, Bcas1-204 and Bcas1-207 (lncRNA). Tracks: Genes (Comprehensive set), Contigs, Regulatory Build. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene).


Transcript: ENSMUST00000013667

Figure: Protein view of Bcas1-201 (protein coding; reverse strand, 80.68 kb). Tracks: MobiDB lite; Low complexity (Seg); PANTHER (Novel Amplified in Breast Cancer-1); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 633 amino acids.

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
