https://www.alphaknockout.com

Mouse Smarce1 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Smarce1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Smarce1 gene (NCBI Reference Sequence: NM_020618; Ensembl: ENSMUSG00000037935) is located on mouse chromosome 11. Eleven exons are identified, with the ATG start codon in exon 2 and the TAA stop codon in exon 11 (Transcript: ENSMUST00000103133). Exons 5~7 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Smarce1 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-96J15 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit prenatal lethality.

Exon 5 starts at about 12.73% of the coding region. Knockout of exons 5~7 will result in a frameshift of the gene. The size of intron 4, for 5'-loxP site insertion, is 4302 bp, and the size of intron 7, for 3'-loxP site insertion, is 3933 bp. The size of the effective cKO region is ~2016 bp. The cKO region does not contain any other known gene.
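
As a rough check of these figures, the following Python sketch converts the 411 aa protein length reported for transcript ENSMUST00000103133 into a coding-sequence length and estimates where exon 5 begins. The helper names are hypothetical, and the exon lengths passed to the frameshift check are placeholders, since this report does not list per-exon lengths.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in nucleotides for a protein of the given length."""
    codons = protein_aa + (1 if include_stop else 0)
    return 3 * codons


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


# 411 aa is the protein length reported for ENSMUST00000103133.
cds_nt = cds_length_nt(411)

# 12.73% is the position quoted in the report; the result is only approximate.
exon5_start_nt = round(0.1273 * cds_nt)

# Placeholder exon lengths (NOT taken from the report), used only to show the check.
example_deleted_nt = sum([100, 120, 81])

print(cds_nt, exon5_start_nt, causes_frameshift(example_deleted_nt))
```

With the real coding lengths of exons 5, 6 and 7 substituted for the placeholders, the same check would confirm whether the deletion leaves the downstream reading frame intact.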

Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (exons 1 to 11, 5' to 3', with gRNA regions flanking exons 5~7), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Smarce1; homology arm; cKO region; loxP site.]

Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
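
The self-alignment described in this note can be sketched as follows. This is a minimal illustration, not the tool used to produce the plot: it indexes every window of a sequence and reports pairs of positions that share an identical window on the forward strand only (reverse-complement matches are omitted). The function name and the toy sequence are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple


def self_dot_plot(seq: str, window: int = 10) -> List[Tuple[int, int]]:
    """Return (i, j) pairs with i < j where seq[i:i+window] == seq[j:j+window]."""
    index = defaultdict(list)
    for i in range(len(seq) - window + 1):
        index[seq[i:i + window]].append(i)

    dots = []
    for positions in index.values():
        # Every pair of positions sharing the same window is one off-diagonal dot.
        for a in positions:
            for b in positions:
                if a < b:
                    dots.append((a, b))
    return sorted(dots)


# Toy sequence (hypothetical): an 8 bp unit repeated after a 2 bp spacer.
toy = "ACGTTGCA" + "GG" + "ACGTTGCA"
dots = self_dot_plot(toy, window=8)
print(dots)
```

Runs of dots lying parallel to the main diagonal would indicate tandem or dispersed repeats; an empty result corresponds to the "no significant tandem repeat" outcome.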

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (8516 bp) | A (27.84%, 2371) | C (19.74%, 1681) | T (30.82%, 2625) | G (21.59%, 1839)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
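
For reference, the overall GC content implied by the base counts in the summary, and a windowed GC scan of the kind plotted, can be sketched in Python as below. The function names, the step size and the toy sequence are assumptions made for illustration; only the base counts and the 300 bp window come from this report.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC fraction of each full window, moving along the sequence by `step`."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


# Base counts as given in the summary line of the report.
counts = {"A": 2371, "C": 1681, "T": 2625, "G": 1839}
total = sum(counts.values())
overall_gc = (counts["G"] + counts["C"]) / total
print(total, round(overall_gc * 100, 2))

# Toy sequence (hypothetical) to show the windowed scan.
print(sliding_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 8516 bp and give an overall GC content of about 41.3%.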

BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START     END       SPAN
YourSeq  3000   1      3000  3000   100.0%    chr11  -       99220986  99223985  3000
YourSeq  41     1501   1555  3000   97.8%     chr7   -       87178077  87178351  275
YourSeq  24     2685   2709  3000   100.0%    chr12  +       66099250  66099275  26

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
YourSeq  3000   1      3000  3000   100.0%    chr11  -       99215970   99218969   3000
YourSeq  472    112    1092  3000   88.3%     chr1   -       155404975  155405990  1016
YourSeq  418    443    1092  3000   87.1%     chr11  -       94551858   94552418   561
YourSeq  406    442    1090  3000   86.6%     chr5   +       144007304  144007966  663
YourSeq  400    444    1087  3000   87.7%     chr17  -       47406081   47406745   665
YourSeq  397    441    1092  3000   88.1%     chr5   +       116000890  116001439  550
YourSeq  396    444    1085  3000   89.1%     chr17  +       15509408   15510071   664
YourSeq  391    441    1092  3000   89.1%     chr16  +       35762365   35763032   668
YourSeq  390    441    1092  3000   86.0%     chr13  -       40044957   40045520   564
YourSeq  389    441    1092  3000   86.6%     chr12  -       13395618   13396180   563
YourSeq  382    441    1092  3000   86.1%     chr1   -       23394992   23395554   563
YourSeq  381    441    1069  3000   88.6%     chr2   -       125127983  125128635  653
YourSeq  381    441    1091  3000   85.0%     chr15  -       41261124   41261685   562
YourSeq  380    441    1092  3000   87.5%     chr9   -       78028545   78029222   678
YourSeq  379    441    1092  3000   85.9%     chr5   -       33913774   33914328   555
YourSeq  379    441    1092  3000   84.7%     chr18  +       12560065   12560625   561
YourSeq  378    441    1091  3000   84.5%     chr5   +       138122127  138122686  560
YourSeq  378    442    1091  3000   89.7%     chr14  +       87394587   87395278   692
YourSeq  377    441    1092  3000   85.4%     chr17  -       87028849   87029404   556
YourSeq  377    441    1092  3000   84.0%     chr6   +       24475966   24476528   563

Note: The 3000 bp section downstream of Exon 7 is BLAT searched against the genome. No significant similarity is found.
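
To inspect hits like those listed in the BLAT results, a small parser can separate the self-match from secondary matches and report how much of the 3000 bp query each one covers. The sketch below is illustrative only: the class, the function names and the rule used to identify the self-match (100% identity over the full query) are assumptions, and the three rows are copied from the downstream search results.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated result row; any leading link text is ignored."""
    f = line.split()[-11:]  # the last 11 fields start at the query name
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr11 - 99215970 99218969 3000",
    "YourSeq 472 112 1092 3000 88.3% chr1 - 155404975 155405990 1016",
    "YourSeq 418 443 1092 3000 87.1% chr11 - 94551858 94552418 561",
]
hits = [parse_hit(r) for r in rows]

# Assumed rule: the self-match is the hit with 100% identity over the whole query.
secondary = [h for h in hits
             if not (h.identity == 100.0 and query_coverage(h) == 1.0)]

for h in secondary:
    print(h.chrom, h.score, h.identity, round(query_coverage(h), 3))
```

Secondary hits that cover only part of the query at lower identity, as here, are the ones to keep in mind when placing PCR primers within the homology arm.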

Gene and information: Smarce1

SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1 [Mus musculus (house mouse)]
Gene ID: 57376, updated on 12-Aug-2019

Gene summary

Official Symbol: Smarce1 (provided by MGI)
Official Full Name: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1 (provided by MGI)
Primary source: MGI:MGI:1927347
See related: Ensembl:ENSMUSG00000037935
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Baf57; 2810417B20Rik; 5830412H02Rik; 9030408N19Rik
Expression: Broad expression in CNS E11.5 (RPKM 73.8), CNS E14 (RPKM 55.2) and 23 other tissues

Genomic context

Location: 11; 11 D

Exon count: 11

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (99209047..99231017, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (99070361..99092331, complement)

[Figure: genomic context of Smarce1 on Chromosome 11 - NC_000077.6.]

Transcript information: This gene has 6 transcripts

Gene: Smarce1 ENSMUSG00000037935

Description: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1 [Source:MGI Symbol;Acc:MGI:1927347]
Gene Synonyms: 2810417B20Rik, 5830412H02Rik, 9030408N19Rik, BAF57
Location: Chromosome 11: 99,209,047-99,231,017 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 6 transcripts (splice variants), 312 orthologues, 33 paralogues, is a member of 1 Ensembl protein family and is associated with 1 phenotype.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Smarce1-201  ENSMUST00000103133.3  2919  411aa       ENSMUSP00000099422.3  Protein coding  CCDS25374  O54941   TSL:1 GENCODE basic APPRIS P1
Smarce1-202  ENSMUST00000128215.1  3117  No protein  -                     lncRNA          -          -        TSL:1
Smarce1-204  ENSMUST00000135040.7  761   No protein  -                     lncRNA          -          -        TSL:1
Smarce1-206  ENSMUST00000156160.1  610   No protein  -                     lncRNA          -          -        TSL:2
Smarce1-205  ENSMUST00000155062.1  481   No protein  -                     lncRNA          -          -        TSL:2
Smarce1-203  ENSMUST00000128707.7  456   No protein  -                     lncRNA          -          -        TSL:5

[Figure: Ensembl genome browser view of the Smarce1 locus (41.97 kb, 99.20 Mb to 99.24 Mb, forward and reverse strands), showing contig AL590991.14, the Smarce1 transcripts (Smarce1-201 protein coding; Smarce1-202 to Smarce1-206 lncRNA), the neighboring Krt222 transcripts (Krt222-201 and Krt222-202 protein coding; Krt222-203 lncRNA), and the Regulatory Build track. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene).]

Transcript: ENSMUST00000103133

[Figure: protein domain view of Smarce1-201 (protein coding, reverse strand, 21.97 kb; protein ENSMUSP00000099..., scale 0 to 411 aa). Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); Superfamily: High mobility group box domain superfamily; SMART: High mobility group box domain; Pfam: High mobility group box domain; PROSITE profiles: High mobility group box domain; PANTHER: PTHR46232:SF2, SWI/SNF complex subunit BAF57; Gene3D: High mobility group box domain superfamily; CDD: cd01390; sequence variants (dbSNP and all other sources). Variant legend: stop gained, missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
