https://www.alphaknockout.com

Mouse Fam185a Knockout Project (CRISPR/Cas9)

Objective: To create a Fam185a knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fam185a gene (NCBI Reference Sequence: NM_177869; Ensembl: ENSMUSG00000047221) is located on mouse chromosome 5. Eight exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 8 (Transcript: ENSMUST00000056045). Exons 2~3 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from about 36.16% of the coding region. Exons 2~3 cover 17.9% of the coding region. The size of the effective KO region is ~4060 bp. The KO region does not contain any other known gene.


Overview of the Targeting Strategy

[Figure: wildtype allele drawn 5' to 3', showing exons 1, 2, 3 and 8 of mouse Fam185a, the two gRNA regions, and the knockout region. Legend: exon of mouse Fam185a; knockout region.]


Overview of the Dot Plot (up)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
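The self-alignment behind a dot plot can be illustrated with a minimal sketch. It assumes exact word matches on the forward strand only; the report's plot also shows reverse-complement matches, and the actual tool and scoring used are not stated. The function names and the `max_offset` threshold are hypothetical choices made for illustration.

```python
def self_dot_plot(seq, window=15):
    """Return sorted (i, j) pairs with i < j where the words of length
    `window` starting at positions i and j are identical."""
    seq = seq.upper()
    positions = {}
    # Index every word of length `window` by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    # Every pair of start positions sharing a word is one off-diagonal dot.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


def has_tandem_repeat(seq, window=15, max_offset=100):
    """Heuristic (assumption): report a tandem repeat if any two identical
    words lie within `max_offset` bp of each other."""
    return any(j - i <= max_offset for i, j in self_dot_plot(seq, window))
```

For a 2000 bp flanking section, `has_tandem_repeat(section)` returning `False` would correspond to an empty plot away from the main diagonal under these simplified assumptions.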

Overview of the Dot Plot (down)

Window size: 15 bp

[Figure: self-alignment dot plot. Legend: Forward; Reverse Complement.]

Note: The 2000 bp section downstream of Exon 3 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.


Overview of the GC Content Distribution (up)

Window size: 300 bp

Summary: Full Length(2000bp) | A(21.6% 432) | C(22.5% 450) | T(34.95% 699) | G(20.95% 419)

Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
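As a worked check, the overall GC content follows directly from the base counts in the summary: for the upstream section, C (450) plus G (419) over 2000 bp gives 43.45%. The sketch below shows that arithmetic and one simple way to compute a sliding-window GC profile. The function names are hypothetical, and the report does not state how its own profile was computed.

```python
from collections import Counter


def gc_percent_from_counts(c_count, g_count, total):
    """Overall GC percentage from base counts."""
    return 100.0 * (c_count + g_count) / total


def base_composition(seq):
    """Map each base to (count, percentage) for A, C, G and T."""
    counts = Counter(seq.upper())
    total = len(seq)
    return {b: (counts[b], 100.0 * counts[b] / total) for b in "ACGT"}


def windowed_gc(seq, window=300, step=1):
    """GC percentage of each window of length `window`, moved by `step`."""
    seq = seq.upper()
    profile = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        profile.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return profile


# Counts taken from the upstream summary line of this report.
upstream_gc = gc_percent_from_counts(c_count=450, g_count=419, total=2000)
```

A region would typically be flagged as GC-rich when some window in `windowed_gc(section)` exceeds a chosen threshold; the report does not give the threshold it applies.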

Overview of the GC Content Distribution (down)

Window size: 300 bp

Summary: Full Length(2000bp) | A(25.85% 517) | C(19.0% 380) | T(33.05% 661) | G(22.1% 442)

Note: The 2000 bp section downstream of Exon 3 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   +       21427725   21429724   2000
YourSeq  1126   58     1409  2000   93.0%     chr10  -       39037630   39038988   1359
YourSeq  1116   65     1411  2000   92.6%     chr11  -       5658416    5659774    1359
YourSeq  1103   66     1387  2000   92.9%     chr9   -       52847075   52848407   1333
YourSeq  1103   65     1413  2000   92.2%     chr17  -       24788527   24789894   1368
YourSeq  1097   55     1413  2000   92.5%     chr7   +       114378337  114379745  1409
YourSeq  1093   66     1390  2000   92.2%     chr3   -       157456996  157458332  1337
YourSeq  1088   52     1413  2000   92.2%     chr9   -       124461258  124462619  1362
YourSeq  1088   55     1406  2000   92.1%     chr10  +       24248692   24250043   1352
YourSeq  1084   65     1408  2000   91.4%     chrX   +       14214162   14859011   644850
YourSeq  1082   41     1410  2000   91.7%     chr10  +       72500320   72501691   1372
YourSeq  1080   52     1344  2000   93.5%     chr10  +       58279097   58280396   1300
YourSeq  1076   65     1413  2000   90.6%     chr10  +       34798375   34799719   1345
YourSeq  1074   66     1399  2000   92.6%     chr18  -       50489129   50490469   1341
YourSeq  1068   59     1413  2000   91.5%     chr16  -       16694473   16695821   1349
YourSeq  1067   59     1408  2000   91.7%     chr9   -       79536115   79537471   1357
YourSeq  1067   66     1390  2000   91.8%     chr7   -       107154684  107156005  1322
YourSeq  1064   65     1375  2000   92.2%     chr7   -       92458997   92460316   1320
YourSeq  1063   75     1340  2000   93.0%     chr14  -       120717891  120719164  1274
YourSeq  1063   38     1404  2000   91.2%     chr14  -       14443880   14445246   1367

Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  2000   1      2000  2000   100.0%    chr5   +       21433785   21435784   2000
YourSeq  149    1408   1608  2000   90.4%     chr9   -       21849397   21849599   203
YourSeq  149    1428   1624  2000   89.9%     chr10  +       56603139   56603544   406
YourSeq  148    1427   1624  2000   93.1%     chr14  +       20646678   20647142   465
YourSeq  146    1407   1605  2000   89.3%     chr8   +       105922456  105972446  49991
YourSeq  143    1410   1605  2000   88.9%     chr11  +       15719642   15719951   310
YourSeq  143    1419   1604  2000   86.9%     chr1   +       69739676   69739858   183
YourSeq  142    1420   1617  2000   89.5%     chr6   -       43930931   43931131   201
YourSeq  142    1405   1600  2000   88.7%     chr15  -       28291493   28291697   205
YourSeq  140    1425   1605  2000   89.4%     chr6   +       51491375   51491559   185
YourSeq  139    1401   1601  2000   89.1%     chr5   +       18907755   18907959   205
YourSeq  138    1414   1606  2000   87.1%     chr5   -       48838120   48838314   195
YourSeq  137    1425   1607  2000   86.0%     chr10  -       76102829   76103000   172
YourSeq  137    1433   1605  2000   88.7%     chr7   +       90226863   90227033   171
YourSeq  136    1414   1601  2000   88.7%     chr19  -       10011278   10011465   188
YourSeq  134    1416   1601  2000   87.1%     chr3   -       116747491  116747673  183
YourSeq  134    1427   1604  2000   84.9%     chr9   +       78385759   78385923   165
YourSeq  134    1441   1605  2000   91.0%     chr6   +       111062872  111063039  168
YourSeq  134    1419   1605  2000   89.1%     chr1   +       132181493  132181688  196
YourSeq  133    1427   1601  2000   88.6%     chr1   -       81846436   81846625   190

Note: The 2000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
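The chr5 self-hits in the two BLAT tables can be cross-checked against the stated KO region size. The sketch below assumes that the listed coordinates are 1-based and inclusive (consistent with each 2000 bp query spanning exactly 2000 positions) and that the two flanking sections directly abut the KO region; neither assumption is stated explicitly in the report. The helper names are hypothetical.

```python
def span(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def gap_between(upstream_end, downstream_start):
    """Number of bases strictly between two 1-based, inclusive positions."""
    return downstream_start - upstream_end - 1


# chr5 self-hits copied from the BLAT tables (up and down).
UP_START, UP_END = 21427725, 21429724
DOWN_START, DOWN_END = 21433785, 21435784

up_len = span(UP_START, UP_END)          # length of the upstream section
down_len = span(DOWN_START, DOWN_END)    # length of the downstream section
ko_len = gap_between(UP_END, DOWN_START)  # bases between the two sections
```

Under these assumptions the gap between the two sections is 4060 bp, which agrees with the effective KO region size of ~4060 bp given in the strategy summary.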


Gene and information: Fam185a family with sequence similarity 185, member A [ Mus musculus (house mouse) ] Gene ID: 330050, updated on 13-Aug-2019

Gene summary

Official Symbol: Fam185a (provided by MGI)
Official Full Name: family with sequence similarity 185, member A (provided by MGI)
Primary source: MGI:MGI:2140983
See related: Ensembl:ENSMUSG00000047221
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: AI847670; 4631428I10
Expression: Ubiquitous expression in testis adult (RPKM 1.8), frontal lobe adult (RPKM 1.5) and 28 other tissues
Orthologs: human

Genomic context

Location: 5; 5 A3
Exon count: 10

Annotation release  Status             Assembly                       Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)   5    NC_000071.6 (21424861..21488271)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)     5    NC_000071.5 (20930721..20987942)

[Figure: genomic context on Chromosome 5 - NC_000071.6]


Transcript information: This gene has 3 transcripts

Gene: Fam185a ENSMUSG00000047221

Description: family with sequence similarity 185, member A [Source:MGI Symbol;Acc:MGI:2140983]
Location: Chromosome 5: 21,424,958-21,482,124 forward strand. GRCm38:CM000998.2
About this gene: This gene has 3 transcripts (splice variants), 176 orthologues and is a member of 1 Ensembl protein family.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Fam185a-201  ENSMUST00000056045.4  3027  378aa       ENSMUSP00000058333.4  Protein coding   CCDS39022  Q7TPD2   TSL:1, GENCODE basic, APPRIS P1
Fam185a-203  ENSMUST00000153301.1  1590  No protein  -                     Retained intron  -          -        TSL:1
Fam185a-202  ENSMUST00000146056.1  941   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl genomic view, 77.17 kb. Forward strand: Fam185a-201 (protein coding), Fam185a-203 (retained intron), Fam185a-202 (lncRNA). Contig: AC157936.5. Reverse strand: Ccdc146-201 and Ccdc146-202 (protein coding), Fbxl13-201 (protein coding), Fbxl13-203 (retained intron). Additional track: Regulatory Build.]


Transcript: ENSMUST00000056045

[Figure: Ensembl view of Fam185a-201 (protein coding), 57.17 kb, forward strand. Protein tracks: low complexity (Seg); Pfam: Putative adhesin; PANTHER: PTHR34094, PTHR34094:SF1; sequence variants (dbSNP and all other sources), with legend missense variant, splice region variant, synonymous variant. Scale bar: 0 to 378.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
