https://www.alphaknockout.com

Mouse Smarcc2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Smarcc2 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Smarcc2 gene (NCBI Reference Sequence: NM_001114097; Ensembl: ENSMUSG00000025369) is located on mouse chromosome 10. 28 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 28 (Transcript: ENSMUST00000105235). Exons 2-3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Smarcc2 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-128E20 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a targeted allele exhibit a slight increase in embryo weight at E13.5 and die shortly after birth (P0-P3). Mice homozygous for a conditional allele activated in the brain exhibit reduced cerebral cortical size and thickness.

Exon 2 starts at about 3.08% of the coding region. Knockout of exons 2-3 will cause a frameshift in the gene. Intron 1, the site of 5'-loxP insertion, is 1914 bp; intron 3, the site of 3'-loxP insertion, is 625 bp. The effective cKO region is ~2215 bp and contains no other known gene.
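The frameshift reasoning above can be illustrated with a minimal sketch. The 3.08% figure and the 1213 aa protein length come from this report; the exon lengths used below are hypothetical placeholders, since the report does not list them, and the helper names are illustrative only.

```python
from typing import Tuple


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


def approx_cds_position(fraction: float, protein_aa: int) -> Tuple[int, int]:
    """Convert a fraction of the coding region into an approximate (nucleotide, codon) position."""
    cds_bp = protein_aa * 3        # coding nucleotides, stop codon excluded
    nt = round(fraction * cds_bp)  # approximate 1-based nucleotide position
    codon = (nt + 2) // 3          # codon that contains that nucleotide
    return nt, codon


# Hypothetical coding lengths for exons 2 and 3 (NOT taken from the report).
exon2_bp, exon3_bp = 100, 121
print(causes_frameshift(exon2_bp + exon3_bp))  # True, because 221 % 3 == 2

# 3.08% and 1213 aa are the values quoted in the report.
print(approx_cds_position(0.0308, 1213))       # (112, 38)
```

Under these assumptions exon 2 would begin near codon 38, so a frameshift introduced there would disrupt almost the entire 1213 aa reading frame.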


Overview of the Targeting Strategy

[Figure: schematic of four alleles. Wildtype allele (5' to 3'; exons 1, 2, 3, 4, 5, 6 ... 28, with the two gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: homology arm; exon of mouse Smarcc2; cKO region; loxP site.]
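The relationship between the targeted allele and the constitutive KO allele can be sketched as a toy model. This is only an illustration: the segment names and the `cre_recombine` helper are hypothetical, and the model assumes two loxP sites in the same orientation, which is the arrangement that produces a deletion.

```python
from typing import List

LOXP = "loxP"


def cre_recombine(allele: List[str]) -> List[str]:
    """Excise everything between the first and last loxP site, leaving one loxP behind."""
    sites = [i for i, segment in enumerate(allele) if segment == LOXP]
    if len(sites) < 2:
        return list(allele)  # fewer than two sites: nothing to recombine
    first, last = sites[0], sites[-1]
    return allele[:first] + [LOXP] + allele[last + 1:]


# Targeted allele: exons 2-3 flanked by loxP sites in introns 1 and 3.
targeted = ["exon1", LOXP, "exon2", "exon3", LOXP, "exon4", "exon5", "exon6"]
print(cre_recombine(targeted))
# ['exon1', 'loxP', 'exon4', 'exon5', 'exon6']
```

In tissues without Cre the targeted allele keeps all exons, which is what makes the knockout conditional.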


Overview of the Dot Plot

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches. Window size: 10 bp.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
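A self-alignment of this kind can be approximated with exact word matching. The sketch below is an assumption about how such a dot plot might be computed, not the tool actually used for the report; the function names and the toy sequence are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dotplot(seq: str, window: int = 10) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (forward, reverse_complement) dots for exact window-sized self matches.

    Forward dots (i, j) with j > i lie off the main diagonal; runs of them
    parallel to the diagonal indicate direct (tandem) repeats.
    """
    seq = seq.upper()
    n_words = len(seq) - window + 1
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(n_words):
        index[seq[i:i + window]].append(i)

    forward: List[Tuple[int, int]] = []
    reverse: List[Tuple[int, int]] = []
    for i in range(n_words):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


# Toy sequence: a hypothetical 10 bp unit repeated three times in tandem.
toy = "ACGTTGCAGA" * 3
fwd, rev = self_dotplot(toy, window=10)
print(len(fwd), len(rev))
```

Exact matching is stricter than most dot plot tools, which usually tolerate mismatches within the window, so this sketch would under-report diverged repeats.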

Overview of the GC Content Distribution

[Figure: GC content distribution along Sequence 12. Window size: 300 bp.]

Summary: Full Length(8715bp) | A(25.54% 2226) | C(24.41% 2127) | T(25.57% 2228) | G(24.49% 2134)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
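The base counts in the summary can be checked directly, and a windowed profile shows why an overall figure alone is not enough. The counts below are taken from the summary line; the `sliding_gc` helper is a hypothetical sketch of a 300 bp window scan, not the tool used for the report.

```python
from typing import Dict, List


def gc_percent(counts: Dict[str, int]) -> float:
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC percentage of each window of the given size along the sequence."""
    seq = seq.upper()
    return [
        100.0 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Counts from the summary line of the report.
counts = {"A": 2226, "C": 2127, "T": 2228, "G": 2134}
print(sum(counts.values()))          # 8715
print(round(gc_percent(counts), 2))  # 48.89
```

The overall GC content is close to 49%, so the construction difficulty flagged in the note presumably comes from local peaks, which only a windowed scan such as `sliding_gc` would reveal.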


BLAT Search Results (up)

QUERY    SCORE START   END QSIZE IDENTITY CHROM  STRAND     START       END   SPAN
----------------------------------------------------------------------------------
YourSeq   3000     1  3000  3000   100.0% chr10  +      128458103 128461102   3000
YourSeq     71   337   437  3000    91.8% chr13  -       21927863  21928085    223
YourSeq     62   371   436  3000    97.0% chr11  -       69110926  69110991     66
YourSeq     59   371   437  3000    94.1% chr13  -       23518152  23518218     67
YourSeq     59   371   437  3000    94.1% chr13  +       21924944  21925010     67
YourSeq     59   371   437  3000    94.1% chr13  +       21937103  21937169     67
YourSeq     58   371   436  3000    94.0% chr10  +       63429485  63429550     66
YourSeq     57   371   437  3000    92.6% chr4   -       10874070  10874136     67
YourSeq     57   371   437  3000    92.6% chr13  -       21919221  21919287     67
YourSeq     57   371   437  3000    92.6% chr11  -       69037483  69037549     67
YourSeq     57   371   437  3000    92.6% chr13  +       23500966  23501032     67
YourSeq     57   371   437  3000    92.6% chr13  +       21940923  21940989     67
YourSeq     51  1192  1472  3000    93.2% chr19  +       41482671  41483087    417
YourSeq     40   373   438  3000    80.4% chr13  +       21994201  21994266     66
YourSeq     40   397   436  3000   100.0% chr13  +       22022454  22022493     40
YourSeq     33   398   436  3000    92.4% chr19  +       12011123  12011161     39
YourSeq     33   398   436  3000    92.4% chr10  +       12922608  12922646     39
YourSeq     29   402   438  3000    89.2% chr2   +      119046790 119046826     37
YourSeq     29   402   438  3000    89.2% chr13  +       21492885  21492921     37
YourSeq     26     8    48  3000    85.8% chr16  -       34361939  34361977     39

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
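One way to read such a table is to set aside the self-hit and look at the best remaining score relative to the query length. The sketch below parses a few rows copied from the upstream table; the `Hit` type and `parse_hits` helper are hypothetical, and score divided by query size is only a rough indicator, not a formal significance test.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hits(text: str) -> List[Hit]:
    """Parse whitespace-separated BLAT result rows (query name in the first column)."""
    hits = []
    for line in text.strip().splitlines():
        f = line.split()
        hits.append(Hit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                        float(f[5].rstrip("%")), f[6], f[7],
                        int(f[8]), int(f[9]), int(f[10])))
    return hits


# A few rows copied from the upstream BLAT table.
ROWS = """
YourSeq 3000 1 3000 3000 100.0% chr10 + 128458103 128461102 3000
YourSeq 71 337 437 3000 91.8% chr13 - 21927863 21928085 223
YourSeq 62 371 436 3000 97.0% chr11 - 69110926 69110991 66
YourSeq 51 1192 1472 3000 93.2% chr19 + 41482671 41483087 417
"""

hits = parse_hits(ROWS)
self_hit = max(hits, key=lambda h: h.score)        # the query matching its own locus
others = [h for h in hits if h is not self_hit]
best = max(others, key=lambda h: h.score)
print(best.score, f"{100 * best.score / best.q_size:.1f}% of query length")
# 71 2.4% of query length
```

The best secondary hit covers only a small fraction of the 3000 bp query, which is in line with the note that no significant similarity is found.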

BLAT Search Results (down)

QUERY    SCORE START   END QSIZE IDENTITY CHROM  STRAND     START       END   SPAN
----------------------------------------------------------------------------------
YourSeq   3000     1  3000  3000   100.0% chr10  +      128463318 128466317   3000
YourSeq    133  1631  1785  3000    93.6% chr17  -       27607101  27607592    492
YourSeq    121  1646  1783  3000    94.3% chrX   -      154472825 154472975    151
YourSeq    118  2597  2996  3000    92.2% chr11  +      115094760 115269033 174274
YourSeq    116  1649  1785  3000    92.8% chr17  -       56951546  56951683    138
YourSeq    111  1650  1784  3000    96.0% chr4   -       24417383  24417517    135
YourSeq    109  1658  1784  3000    93.6% chr19  -       17152232  17152364    133
YourSeq    108  2544  2997  3000    92.2% chr17  -       30161621  30162087    467
YourSeq    105  1650  2270  3000    74.7% chr12  +       84504344  84504535    192
YourSeq     94  1662  1784  3000    88.6% chr15  -        5261588   5261712    125
YourSeq     92  1668  1781  3000    90.4% chr13  -       39515600  39515713    114
YourSeq     92  1622  1783  3000    84.3% chr18  +       67960182  67960324    143
YourSeq     89  1664  1780  3000    88.1% chr3   +       41355381  41355497    117
YourSeq     86  2597  2999  3000    87.3% chr14  -      122192088 122192486    399
YourSeq     84  1660  1785  3000    85.9% chr4   +      108563154 108563276    123
YourSeq     82  2804  2996  3000    79.4% chr1   +      119250180 119250273     94
YourSeq     81  2543  2982  3000    71.0% chr8   -      111217192 111217443    252
YourSeq     81  2775  2997  3000    94.6% chr2   -      152812604 152812943    340
YourSeq     80  1671  1774  3000    91.0% chr12  -        7786081   7786184    104
YourSeq     80  2884  2999  3000    92.6% chr2   +      158110296 158110707    412

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
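The chr10 self-hit coordinates in the two BLAT tables allow a quick consistency check on the region sizes. The arithmetic below uses only coordinates reported in this document; reading the gap between the two windows as the cKO region is an inference, not something the report states.

```python
def span(start: int, end: int) -> int:
    """Length of a closed, 1-based coordinate interval."""
    return end - start + 1


# chr10 self-hits from the upstream and downstream BLAT tables.
up_start, up_end = 128458103, 128461102
down_start, down_end = 128463318, 128466317

gap_start, gap_end = up_end + 1, down_start - 1

print(span(up_start, up_end))                        # 3000
print(span(down_start, down_end))                    # 3000
print(gap_start, gap_end, span(gap_start, gap_end))  # 128461103 128463317 2215
```

The 2215 bp gap equals the stated size of the effective cKO region, which suggests that the two 3000 bp windows directly flank that region.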


Gene and information:

Smarcc2 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 2 [Mus musculus (house mouse)]
Gene ID: 68094, updated on 15-Oct-2019

Gene summary

Official Symbol: Smarcc2 (provided by MGI)
Official Full Name: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 2 (provided by MGI)
Primary source: MGI:MGI:1915344
See related: Ensembl:ENSMUSG00000025369
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 5930405J04Rik
Expression: Ubiquitous expression in CNS E14 (RPKM 85.9), whole brain E14.5 (RPKM 78.9) and 28 other tissues
Orthologs: human

Genomic context

Location: 10; 10 D3

Exon count: 29

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  10   NC_000076.6 (128459182..128490586)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    10   NC_000076.5 (127896292..127927230)

Chromosome 10 - NC_000076.6


Transcript information: This gene has 6 transcripts

Gene: Smarcc2 ENSMUSG00000025369

Description: SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 2 [Source:MGI Symbol;Acc:MGI:1915344]
Gene Synonyms: 5930405J04Rik
Location: Chromosome 10: 128,459,248-128,490,482 forward strand. GRCm38:CM001003.2
About this gene: This gene has 6 transcripts (splice variants), 193 orthologues, 3 paralogues, is a member of 1 Ensembl protein family and is associated with 6 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Smarcc2-202  ENSMUST00000099131.10  4562  1130aa      ENSMUSP00000096734.3  Protein coding   CCDS48726  Q3UID0      TSL:1, GENCODE basic, APPRIS ALT2
Smarcc2-201  ENSMUST00000026433.8   4533  1099aa      ENSMUSP00000026433.7  Protein coding   CCDS24278  Q6PDG5      TSL:1, GENCODE basic, APPRIS P3
Smarcc2-203  ENSMUST00000105235.9   4032  1213aa      ENSMUSP00000100868.2  Protein coding   CCDS48727  Q6PDG5      TSL:1, GENCODE basic, APPRIS ALT2
Smarcc2-205  ENSMUST00000218228.1   603   142aa       ENSMUSP00000151333.1  Protein coding   -          A0A1W2P6N7  TSL:5, GENCODE basic
Smarcc2-204  ENSMUST00000217751.1   865   No protein  -                     Retained intron  -          -           TSL:5
Smarcc2-206  ENSMUST00000220384.1   735   No protein  -                     lncRNA           -          -           TSL:3
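The transcript used for the design, ENSMUST00000105235, is the protein-coding transcript that encodes the longest protein in this table. The report does not state how the reference transcript was chosen, so the sketch below is only one plausible selection rule, with the table values entered by hand.

```python
from typing import List, Optional, Tuple

# (name, transcript ID, transcript bp, protein length in aa or None, biotype)
Transcript = Tuple[str, str, int, Optional[int], str]

TRANSCRIPTS: List[Transcript] = [
    ("Smarcc2-202", "ENSMUST00000099131.10", 4562, 1130, "Protein coding"),
    ("Smarcc2-201", "ENSMUST00000026433.8", 4533, 1099, "Protein coding"),
    ("Smarcc2-203", "ENSMUST00000105235.9", 4032, 1213, "Protein coding"),
    ("Smarcc2-205", "ENSMUST00000218228.1", 603, 142, "Protein coding"),
    ("Smarcc2-204", "ENSMUST00000217751.1", 865, None, "Retained intron"),
    ("Smarcc2-206", "ENSMUST00000220384.1", 735, None, "lncRNA"),
]


def longest_coding(transcripts: List[Transcript]) -> Transcript:
    """Pick the protein-coding transcript with the longest protein (an assumed rule)."""
    coding = [t for t in transcripts if t[4] == "Protein coding" and t[3] is not None]
    return max(coding, key=lambda t: t[3])


name, transcript_id, _, protein_aa, _ = longest_coding(TRANSCRIPTS)
print(name, transcript_id, protein_aa)
# Smarcc2-203 ENSMUST00000105235.9 1213
```

Note that the longest transcript by nucleotide length is Smarcc2-202 (4562 bp), so ranking by transcript length instead of protein length would give a different answer.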

[Figure: Ensembl genome browser view of a 51.23 kb region of chromosome 10 (128.45 Mb to 128.50 Mb). Forward strand genes (comprehensive set): Mir8105-201 (miRNA), Smarcc2-204 (retained intron), Smarcc2-205, Smarcc2-201, Smarcc2-202 and Smarcc2-203 (protein coding), Smarcc2-206 (lncRNA). Contigs: AC090489.8, AC170752.2. Reverse strand genes: Myl6-201 to Myl6-215 (protein coding, retained intron, lncRNA and nonsense mediated decay transcripts), Myl6b-201 (protein coding), Myl6b-202 (retained intron). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (processed transcript, RNA gene).]


Transcript: ENSMUST00000105235

[Figure: Ensembl protein summary for Smarcc2-203 (protein coding; ENSMUSP00000100...; 30.04 kb, forward strand), drawn on a scale of 0 to 1213 amino acids. Annotation tracks:
MobiDB lite; Low complexity (Seg)
Superfamily: BRCT domain superfamily; Homeobox-like domain superfamily
SMART: Chromo/chromo shadow domain; SANT/Myb domain
Pfam: SMARCC, N-terminal; SWIRM domain; SMARCC, SWIRM-associated domain; SANT/Myb domain; SMARCC, C-terminal
PROSITE profiles: SWIRM domain; SANT domain
PANTHER: PTHR12802; SWI/SNF complex subunit SMARCC2
Gene3D: Winged helix-like DNA-binding domain superfamily; 1.10.10.60
CDD: SANT/Myb domain
Sequence variants (dbSNP and all other sources): missense variant; synonymous variant]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
