https://www.alphaknockout.com

Mouse Fam114a2 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Fam114a2 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fam114a2 gene (NCBI Reference Sequence: NM_001168668; Ensembl: ENSMUSG00000020523) is located on Mouse chromosome 11. 14 exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 14 (Transcript: ENSMUST00000108850). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Fam114a2 gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-206L21 as template. Cas9, gRNA and targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts from about 26.49% of the coding region. The knockout of Exon 5 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion: 3775 bp, and the size of intron 5 for 3'-loxP site insertion: 1750 bp. The size of the effective cKO region: ~589 bp. The cKO region does not contain any other known gene.
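The frameshift reasoning in the note can be sketched in a few lines of Python. This is only an illustration: the coding-sequence length is derived here from the 497 aa protein of transcript ENSMUST00000108850 under the assumption that the stop codon is counted, and the exon lengths passed to `causes_frameshift` are hypothetical, since the report does not state the length of Exon 5.

```python
# Values taken from the report
PROTEIN_LENGTH_AA = 497        # Fam114a2-202 (ENSMUST00000108850)
EXON5_START_FRACTION = 0.2649  # "about 26.49% of the coding region"


def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in nucleotides (assumption: stop codon counted)."""
    return (protein_aa + (1 if include_stop else 0)) * 3


def causes_frameshift(deleted_nt: int) -> bool:
    """Deleting a number of coding nucleotides not divisible by 3 shifts the frame."""
    return deleted_nt % 3 != 0


cds = cds_length_nt(PROTEIN_LENGTH_AA)
approx_start_nt = round(EXON5_START_FRACTION * cds)  # approximate CDS position
approx_codon = (approx_start_nt - 1) // 3 + 1        # codon containing that position

print(f"CDS length: {cds} nt")
print(f"Exon 5 begins near CDS nucleotide {approx_start_nt} (codon {approx_codon})")

# Hypothetical exon lengths, for illustration only
print(causes_frameshift(100))  # not a multiple of 3
print(causes_frameshift(99))   # multiple of 3
```

Under these assumptions the deletion begins roughly a quarter of the way into the coding sequence, so a frameshift there would disrupt most of the downstream protein.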


Overview of the Targeting Strategy

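[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 5, 6 and 14 labelled, with the 5' and 3' gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Fam114a2; homology arm; cKO region; loxP site.]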


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot. Series: Forward, Reverse Complement. Axis label: Sequence 12.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot. Axis label: Sequence 12.]

Summary: Full Length(7089bp) | A(23.53% 1668) | C(21.27% 1508) | T(33.18% 2352) | G(22.02% 1561)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
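As a quick cross-check of the summary line, the overall GC content can be computed from the reported base counts. The sketch below uses only the counts given above; the per-window (300 bp) analysis shown in the plot is not reproduced, because the sequence itself is not included in this report.

```python
# Base counts from the summary line (full length 7089 bp)
counts = {"A": 1668, "C": 1508, "T": 2352, "G": 1561}

total = sum(counts.values())
gc_percent = round(100 * (counts["G"] + counts["C"]) / total, 2)

# Per-base percentages, for comparison with the reported values
base_percent = {base: round(100 * n / total, 2) for base, n in counts.items()}

print(f"Total length: {total} bp")
print(f"Overall GC content: {gc_percent}%")
print(base_percent)
```

The counts sum to the stated full length, and the overall GC content comes out at roughly 43%.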


BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr11    -      57509717   57512716    3000
YourSeq     77    735  1347   3000     84.6%  chr1     -     134533115  134692576  159462
YourSeq     74   1276  1354   3000     97.5%  chr14    +      92605896   92605975      80
YourSeq     71   1276  1359   3000     93.9%  chr3     +     111018234  111018317      84
YourSeq     70   1276  1354   3000     94.9%  chr1     +       4924860    4924939      80
YourSeq     69   1276  1354   3000     97.3%  chr5     -     107911313  107911392      80
YourSeq     68   1276  1354   3000     97.3%  chr6     +      11197313   11197392      80
YourSeq     66   1276  1350   3000     94.6%  chr4     -       7722895    7722970      76
YourSeq     66   1276  1354   3000     92.4%  chr15    -      84788575   84788654      80
YourSeq     66   1280  1352   3000     95.9%  chr12    +     102650330  102650403      74
YourSeq     65   1276  1350   3000     97.2%  chr16    -      33041120   33041195      76
YourSeq     64   1276  1362   3000     88.6%  chr2     -     165981885  165981968      84
YourSeq     64   1276  1350   3000     93.3%  chrX     +      42127110   42127185      76
YourSeq     64   1276  1350   3000     94.5%  chr9     +      99399940   99400015      76
YourSeq     63   1276  1351   3000     93.2%  chr8     -     109749520  109749596      77
YourSeq     63   1276  1350   3000     95.8%  chr2     -      33329573   33329648      76
YourSeq     63   1276  1351   3000     94.4%  chr14    -      59873553   59873629      77
YourSeq     63   1276  1354   3000     93.3%  chr10    -      62691835   62691914      80
YourSeq     63   1276  1349   3000     93.2%  chr12    +      71705357   71705431      75
YourSeq     63   1276  1353   3000     91.1%  chr10    +      61245111   61245189      79

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. Apart from the expected 100% self-match on chr11, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1  3000   3000    100.0%  chr11    -      57506128   57509127    3000
YourSeq    204    697  1052   3000     84.6%  chr15    -      85639092   85639418     327
YourSeq    197    746  1312   3000     82.4%  chr10    +      45038071   45038534     464
YourSeq    196    746  1046   3000     85.8%  chr18    +      78490966   78491280     315
YourSeq    187    747  1047   3000     88.6%  chr9     -     107281153  107281466     314
YourSeq    187    746  1058   3000     84.5%  chr2     +     120646536  120646835     300
YourSeq    183    746  1056   3000     88.6%  chr10    +      90727255   90727563     309
YourSeq    182    746  1019   3000     88.3%  chr7     +      75038619   75038893     275
YourSeq    179    735  1058   3000     85.5%  chr11    -      90434458   90434792     335
YourSeq    178    746  1058   3000     86.2%  chr10    +      69133264   69133590     327
YourSeq    177    746  1058   3000     88.5%  chr5     -     125584935  125585255     321
YourSeq    177    746  1052   3000     84.9%  chr10    -     115055057  115055374     318
YourSeq    174    746  1056   3000     84.2%  chr18    -      57314517   57314838     322
YourSeq    173    746  1039   3000     88.5%  chr9     -      84299313   84299620     308
YourSeq    173    746  1030   3000     88.9%  chr15    -      75882340   75882636     297
YourSeq    173    746  1019   3000     86.5%  chr6     +      37681898   37682177     280
YourSeq    172    746  1052   3000     87.8%  chr14    +      62675959   62676279     321
YourSeq    171    746  1025   3000     81.1%  chr6     +     100769618  100769877     260
YourSeq    170    746  1047   3000     83.6%  chr7     +      65998181   65998474     294
YourSeq    169    752  1019   3000     83.4%  chr5     -     103379040  103379310     271

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. Apart from the expected 100% self-match on chr11, no significant similarity is found.
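The top hit in each BLAT table is the query's own location on chr11, so the two self-matches give the genomic coordinates of the upstream and downstream sections. As a rough consistency check, the sketch below computes the interval lying between them, assuming the coordinates are 1-based and inclusive (which agrees with the SPAN column: 57512716 - 57509717 + 1 = 3000). The `Interval` type and `gap_between` helper are illustrative names, not part of any tool used in the report.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumption, consistent with SPAN)
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def gap_between(a: Interval, b: Interval) -> Interval:
    """Return the interval strictly between two non-overlapping intervals."""
    if a.chrom != b.chrom:
        raise ValueError("intervals are on different chromosomes")
    lower, upper = sorted((a, b), key=lambda iv: iv.start)
    if lower.end >= upper.start:
        raise ValueError("intervals overlap")
    return Interval(a.chrom, lower.end + 1, upper.start - 1)


# Self-matches (first row) from the two BLAT tables; the gene is on the
# reverse strand, so "upstream" has the higher coordinates.
upstream = Interval("chr11", 57509717, 57512716)
downstream = Interval("chr11", 57506128, 57509127)

gap = gap_between(upstream, downstream)
print(gap, gap.length)
```

The region between the two sections spans chr11:57509128-57509716, i.e. 589 bp, which appears consistent with the stated size of the effective cKO region (~589 bp).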


Gene and information:

Fam114a2 family with sequence similarity 114, member A2 [Mus musculus]

Gene ID: 67726, updated on 12-Aug-2019

Gene summary

Official Symbol: Fam114a2 (provided by MGI)
Official Full Name: family with sequence similarity 114, member A2 (provided by MGI)
Primary source: MGI:MGI:1917629
See related: Ensembl:ENSMUSG00000020523
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 1810073G14Rik; 9030624B09Rik
Expression: Ubiquitous expression in CNS E14 (RPKM 12.4), whole brain E14.5 (RPKM 11.2) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 11; 11 B1.3

Exon count: 15

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (57482990..57518656, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (57296492..57332146, complement)

[Figure: genomic context of Fam114a2 on Chromosome 11 - NC_000077.6]


Transcript information: This gene has 4 transcripts

Gene: Fam114a2 ENSMUSG00000020523

Description: family with sequence similarity 114, member A2 [Source:MGI Symbol;Acc:MGI:1917629]
Gene Synonyms: 1810073G14Rik, 9030624B09Rik
Location: Chromosome 11: 57,482,993-57,518,617 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 4 transcripts (splice variants), 198 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name          Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Fam114a2-201  ENSMUST00000020831.12  2557  490aa       ENSMUSP00000020831.6  Protein coding  CCDS24717  Q8VE88   TSL:1 GENCODE basic APPRIS P3
Fam114a2-202  ENSMUST00000108850.1   1765  497aa       ENSMUSP00000104478.1  Protein coding  CCDS48801  Q8VE88   TSL:1 GENCODE basic APPRIS ALT2
Fam114a2-204  ENSMUST00000130596.1   759   No protein  -                     lncRNA          -          -        TSL:2
Fam114a2-203  ENSMUST00000129596.1   660   No protein  -                     lncRNA          -          -        TSL:3

[Figure: genome browser view of a 55.62 kb region (57.48 Mb to 57.52 Mb). Forward strand: Mfap3-201 (protein coding), Mfap3-202 (protein coding), Mfap3-203 (protein coding) and Mfap3-204 (retained intron). Contig: AL672236.18. Reverse strand: Fam114a2-201 (protein coding), Fam114a2-202 (protein coding), Fam114a2-203 (lncRNA) and Fam114a2-204 (lncRNA). Additional track: Regulatory Build. Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (merged Ensembl/Havana, Ensembl protein coding); non-protein coding (RNA gene, processed transcript).]


Transcript: ENSMUST00000108850

[Figure: transcript view of Fam114a2-202 (protein coding) on the reverse strand, 34.81 kb. Protein feature tracks: MobiDB lite; low complexity (Seg); Pfam (Protein of unknown function DUF719); PANTHER PTHR12842:SF3 (Protein of unknown function DUF719). Variant track: sequence variants (dbSNP and all other sources). Variant legend: inframe deletion, missense variant, splice region variant, synonymous variant. Scale bar: 0 to 497.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
