https://www.alphaknockout.com

Mouse Surf6 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Surf6 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Surf6 gene (NCBI Reference Sequence: NM_009298; Ensembl: ENSMUSG00000036160) is located on mouse chromosome 2. Five exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 5 (Transcript: ENSMUST00000047632). Exons 2-3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Surf6 gene. To engineer the targeting vector, the homologous arms and cKO region will be generated by PCR using BAC clone RP23-356J14 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts at about 8.92% of the coding region. Knockout of exons 2-3 will result in a frameshift of the gene. The size of intron 1 for 5'-loxP site insertion is 2761 bp, and the size of intron 3 for 3'-loxP site insertion is 6269 bp. The size of the effective cKO region is ~1096 bp. The cKO region does not contain any other known gene.
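The figures in the note can be cross-checked with simple arithmetic. The sketch below assumes a coding region of 355 codons (taken from the 355 aa protein listed for ENSMUST00000047632, stop codon excluded); that length and the function names are illustrative assumptions, not part of the project design.

```python
# Assumed coding length: 355 codons (355 aa protein), stop codon excluded.
CDS_LENGTH_BP = 355 * 3  # an assumption derived from the protein length


def coding_offset(fraction: float, cds_length: int = CDS_LENGTH_BP) -> int:
    """Approximate number of coding bases that precede a given CDS fraction."""
    return round(fraction * cds_length)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Exon 2 starts at about 8.92% of the coding region.
bases_before_exon2 = coding_offset(0.0892)
print(bases_before_exon2)
```

Given the combined coding length of exons 2 and 3 (not stated in this report), `causes_frameshift` would confirm the frameshift claim.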


Overview of the Targeting Strategy

[Figure: schematic of four alleles, top to bottom: wildtype allele (with the 5' and 3' gRNA regions marked), targeting vector, targeted allele, and constitutive KO allele (after Cre recombination). Legend: exon of mouse Surf6, homology arm, cKO region, loxP site.]
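The step from the targeted allele to the constitutive KO allele can be illustrated with a toy model of Cre-mediated excision. This is a minimal sketch, assuming two loxP sites in the same orientation; the `cre_excise` helper and the placeholder allele string are hypothetical and stand in for real sequence.

```python
# Canonical 34 bp loxP site (13 bp inverted repeats flanking an 8 bp spacer).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Return the allele after excision between the first two same-orientation loxP sites.

    The floxed segment is removed and a single loxP site remains.
    """
    first = allele.find(loxp)
    if first == -1:
        return allele  # no loxP site: nothing to recombine
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele  # a single site cannot be recombined
    return allele[:first] + allele[second:]


# Toy targeted allele; lowercase labels are placeholders, not real sequence.
targeted = "exon1-" + LOXP + "-exon2-exon3-" + LOXP + "-exon4-exon5"
print(cre_excise(targeted))
```

In the toy allele, exons 2 and 3 are lost and one loxP site is left between exon 1 and exon 4, which mirrors the constitutive KO allele in the schematic.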


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself (axis label: Sequence 12). Legend: Forward, Reverse Complement.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
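A self-comparison dot plot of this kind can be sketched as follows. This is a minimal illustration that uses exact matching of 10 bp words; the tool that produced the plot above is not named in the report and may score matches differently, and the function names are assumptions.

```python
from __future__ import annotations


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def dot_plot(seq_x: str, seq_y: str, window: int = 10) -> set[tuple[int, int]]:
    """Return (i, j) start positions where window-length words of seq_x and seq_y match exactly."""
    # Index every window-length word of seq_y by its start positions.
    index: dict[str, list[int]] = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            dots.add((i, j))
    return dots


def off_diagonal_forward(seq: str, window: int = 10) -> set[tuple[int, int]]:
    """Forward self-matches away from the main diagonal, which hint at direct repeats."""
    return {(i, j) for i, j in dot_plot(seq, seq, window) if i != j}


# Toy sequence with one 10 bp direct repeat separated by a short spacer.
toy = "ACGTTGCAAG" + "TTTT" + "ACGTTGCAAG"
print(sorted(off_diagonal_forward(toy)))
```

Reverse-complement dots are obtained the same way with `dot_plot(seq, reverse_complement(seq))`. Runs of off-diagonal dots parallel and close to the main diagonal are what a tandem repeat would look like.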

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence (axis label: Sequence 12).]

Summary: Full Length(7596bp) | A(23.22% 1764) | C(24.5% 1861) | T(28.3% 2150) | G(23.97% 1821)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
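The overall GC content follows directly from the base counts in the summary line, and a windowed profile can be computed in a similar way. The sketch below is illustrative: the `step` parameter is an assumption, since the report states only the 300 bp window size.

```python
# Base counts taken from the summary line above.
counts = {"A": 1764, "C": 1861, "T": 2150, "G": 1821}
total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(total, round(gc_percent, 2))


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list[float]:
    """GC percentage for each full window of an uppercase DNA string.

    `step` is a hypothetical parameter; step=1 gives a sliding profile.
    """
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append(100 * (chunk.count("G") + chunk.count("C")) / window)
    return profile
```

The counts sum to the stated full length, so the summary line is internally consistent.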


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       26900183   26903182   3000
YourSeq  246    1334   2784  3000   91.4%     chr1   +       60184521   60660651   476131
YourSeq  222    1346   2781  3000   90.9%     chr1   +       138670391  138981600  311210
YourSeq  117    2014   2526  3000   89.8%     chr19  +       25274469   25275201   733
YourSeq  109    2600   2785  3000   89.2%     chr9   -       15289738   15290228   491
YourSeq  104    2602   2785  3000   83.4%     chr18  -       23458931   23459107   177
YourSeq  104    1371   2741  3000   90.0%     chr18  +       67618324   67626908   8585
YourSeq  103    2648   2781  3000   89.9%     chr12  +       69612048   69612184   137
YourSeq  101    2660   2785  3000   88.0%     chr1   +       86209730   86209853   124
YourSeq  98     2332   2561  3000   87.2%     chr1   -       86692676   86693005   330
YourSeq  97     2332   2526  3000   87.2%     chrX   +       102226450  102226756  307
YourSeq  95     2660   2779  3000   87.3%     chr4   +       126347465  126347582  118
YourSeq  94     1988   2414  3000   88.8%     chr8   -       27719305   27719780   476
YourSeq  92     2660   2789  3000   90.4%     chr6   -       38361862   38362000   139
YourSeq  92     2332   2526  3000   89.8%     chr13  -       38970960   38971274   315
YourSeq  92     2622   2778  3000   87.0%     chr1   +       86409923   86410105   183
YourSeq  91     2600   2768  3000   88.3%     chr4   +       130221021  130221316  296
YourSeq  89     1342   1462  3000   95.1%     chrX   +       152790153  152790523  371
YourSeq  89     2660   2781  3000   88.8%     chr4   +       116648719  116648838  120
YourSeq  88     2660   2779  3000   86.7%     chr8   +       111116407  111116526  120

Note: The 3000 bp section upstream of exon 2 was searched against the genome with BLAT. Apart from the query's own locus on chr2, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   -       26896087   26899086   3000
YourSeq  808    1078   2788  3000   88.5%     chr19  +       58475007   58686430   211424
YourSeq  714    1034   2441  3000   85.7%     chr7   -       16539607   16540904   1298
YourSeq  708    1164   2684  3000   84.6%     chr7   +       118016396  118017787  1392
YourSeq  703    1307   2593  3000   86.1%     chr2   -       127629866  127631007  1142
YourSeq  701    1095   2536  3000   85.0%     chr11  -       94552496   94553757   1262
YourSeq  699    1128   2788  3000   83.9%     chr12  +       57558537   57560134   1598
YourSeq  697    1078   2536  3000   85.0%     chr13  +       110371695  110372990  1296
YourSeq  670    1034   2399  3000   84.9%     chr2   -       157359289  157360523  1235
YourSeq  670    1034   2441  3000   85.3%     chr2   -       136421287  136422581  1295
YourSeq  669    1052   2398  3000   85.2%     chr5   +       41954605   41955836   1232
YourSeq  663    1097   2399  3000   84.8%     chr1   -       34698700   34699854   1155
YourSeq  660    1078   2769  3000   86.7%     chr4   -       54995311   54996864   1554
YourSeq  652    1094   2439  3000   86.2%     chr9   -       48631337   48632573   1237
YourSeq  649    1089   2430  3000   86.7%     chr13  +       18726062   18727275   1214
YourSeq  639    1034   2365  3000   85.1%     chr8   -       124999981  125001204  1224
YourSeq  631    1034   2398  3000   85.6%     chr3   -       41890161   41891401   1241
YourSeq  623    1073   2399  3000   84.6%     chr2   +       27660413   27661584   1172
YourSeq  620    1053   2500  3000   85.7%     chr14  +       65769805   65771120   1316
YourSeq  616    1034   2403  3000   84.0%     chrX   +       104935293  104936524  1232

Note: The 3000 bp section downstream of exon 3 was searched against the genome with BLAT. Apart from the query's own locus on chr2, no significant similarity is found.
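Rows from the BLAT result tables can be parsed programmatically for a closer look at secondary hits. The sketch below is illustrative only: the `BlatHit` class and the `score_fraction` measure (score divided by query size) are assumptions introduced here, and the report does not state what criterion was used to judge significance.

```python
from dataclasses import dataclass


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int

    @property
    def score_fraction(self) -> float:
        """Score relative to query size: a rough proxy for how much of the query matches."""
        return self.score / self.q_size


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated result row, with or without a 'browser details' prefix."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize), float(ident.rstrip("%")),
                   chrom, strand, int(ts), int(te), int(span))


# First two rows of the downstream table.
rows = [
    "YourSeq 3000 1 3000 3000 100.0% chr2 - 26896087 26899086 3000",
    "YourSeq 808 1078 2788 3000 88.5% chr19 + 58475007 58686430 211424",
]
hits = [parse_hit(r) for r in rows]
# The first row is the query matching its own locus; later rows are secondary hits.
for hit in hits[1:]:
    print(hit.chrom, hit.score, f"{hit.score_fraction:.1%}")
```

Ranking the secondary hits by such a measure makes it easy to see how far each one falls below the self-match.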


Gene and information: Surf6 surfeit gene 6 [ Mus musculus (house mouse) ] Gene ID: 20935, updated on 10-Oct-2019

Gene summary

Official Symbol: Surf6 (provided by MGI)
Official Full Name: surfeit gene 6 (provided by MGI)
Primary source: MGI:MGI:98447
See related: Ensembl:ENSMUSG00000036160
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Surf-6; D2Wsu129e
Expression: Ubiquitous expression in CNS E11.5 (RPKM 11.1), CNS E14 (RPKM 7.6) and 28 other tissues

Genomic context

Location: 2 A3; 2 19.08 cM

Exon count: 5

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (26890418..26902887, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (26746292..26758333, complement)

[Figure: genomic context of Surf6 on Chromosome 2 - NC_000068.7]


Transcript information: This gene has 7 transcripts

Gene: Surf6 ENSMUSG00000036160

Description: surfeit gene 6 [Source:MGI Symbol;Acc:MGI:98447]
Gene Synonyms: D2Wsu129e, Surf-6
Location: Chromosome 2: 26,888,628-26,902,879 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 7 transcripts (splice variants), 181 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name       Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt        Flags
Surf6-201  ENSMUST00000047632.13  2971  355aa       ENSMUSP00000048457.7  Protein coding  CCDS15812  P70279 Q3V1X4  TSL:1 GENCODE basic APPRIS P1
Surf6-202  ENSMUST00000114043.1   894   208aa       ENSMUSP00000109677.1  Protein coding  -          A2ALA0         TSL:5 GENCODE basic
Surf6-206  ENSMUST00000140392.1   1372  No protein  -                     lncRNA          -          -              TSL:1
Surf6-207  ENSMUST00000142131.1   748   No protein  -                     lncRNA          -          -              TSL:2
Surf6-205  ENSMUST00000137904.7   680   No protein  -                     lncRNA          -          -              TSL:3
Surf6-204  ENSMUST00000129554.1   653   No protein  -                     lncRNA          -          -              TSL:1
Surf6-203  ENSMUST00000127691.1   372   No protein  -                     lncRNA          -          -              TSL:3


[Figure: Ensembl region view, 34.25 kb, chromosome 2 from 26.88 Mb to 26.91 Mb; contig AL773563.12.
Forward strand: Rpl7a-201 (protein coding); Gm23969-201, Gm22879-201, Gm24134-201 (snoRNA); Rpl7a-202, Rpl7a-203, Rpl7a-204, Rpl7a-205, Rpl7a-206, Rpl7a-207 (lncRNA).
Reverse strand: Surf6-201, Surf6-202 (protein coding); Surf6-203, Surf6-204, Surf6-205, Surf6-206, Surf6-207 (lncRNA); Med22-201, Med22-202, Med22-203 (protein coding).
Regulatory Build legend: CTCF, Open Chromatin, Promoter, Promoter Flank.
Gene legend: Protein Coding (Ensembl protein coding; merged Ensembl/Havana); Non-Protein Coding (RNA gene).]


Transcript: ENSMUST00000047632

[Figure: transcript view of Surf6-201 (protein coding), reverse strand, 12.45 kb.
Protein ENSMUSP00000048... feature tracks: MobiDB lite; Low complexity (Seg); Coiled-coils (Ncoils); Pfam: Ribosomal RNA-processing protein 14/surfeit locus protein 6, C-terminal domain; PANTHER: Surfeit locus 6.
Variant track: sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.
Scale bar: 0 to 355 (amino acids).]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
