https://www.alphaknockout.com

Mouse Fam49a Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Fam49a conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Fam49a gene (NCBI Reference Sequence: NM_029758; Ensembl: ENSMUSG00000020589) is located on mouse chromosome 12. Thirteen exons are identified, with the ATG start codon in exon 4 and the TAG stop codon in exon 13 (Transcript: ENSMUST00000069066). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Fam49a gene. To engineer the targeting vector, homology arms and the cKO region will be generated by PCR using BAC clone RP23-184A24 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 5 starts at about 7.33% of the coding region. Knockout of exon 5 will result in a frameshift of the gene. The size of intron 4 for 5'-loxP site insertion is 17286 bp, and the size of intron 5 for 3'-loxP site insertion is 1046 bp. The size of the effective cKO region is ~622 bp. The cKO region does not contain any other known gene.
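The frameshift reasoning in the note can be illustrated with a short sketch. It assumes the coding region is counted as 323 codons with the stop codon excluded (323 aa is the protein length listed for ENSMUST00000069066), and the exon lengths passed to `causes_frameshift` are hypothetical values for illustration only, since this report does not state the coding length of exon 5.

```python
def causes_frameshift(deleted_coding_nt: int) -> bool:
    """True if removing this many coding nucleotides shifts the reading frame."""
    return deleted_coding_nt % 3 != 0


def coding_offset_nt(fraction: float, cds_length_nt: int) -> int:
    """Approximate count of coding nucleotides upstream of a position
    that is given as a fraction of the coding region."""
    return round(fraction * cds_length_nt)


# Assumption: coding region = 323 codons, stop codon not counted.
CDS_LENGTH_NT = 323 * 3

# "about 7.33% of the coding region" expressed in nucleotides.
offset = coding_offset_nt(0.0733, CDS_LENGTH_NT)
print(offset)

# Hypothetical exon lengths, not taken from this report.
print(causes_frameshift(100))  # length not divisible by 3
print(causes_frameshift(99))   # length divisible by 3
```

A deletion whose coding length is not a multiple of 3 shifts every downstream codon, which is why an early out-of-frame exon is a common choice for a cKO region.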


Overview of the Targeting Strategy

[Figure: schematic of the wildtype allele (with the 5' and 3' gRNA regions marked and exons 1, 5, 6 and 13 labeled), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Fam49a; homology arm; cKO region; loxP site.]
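The conversion from targeted allele to constitutive KO allele can be sketched as a string operation. This is a toy model, not the project's actual sequence: it assumes two loxP sites in the same orientation on the strand given, uses the canonical 34 bp loxP sequence, and the flanking and exon strings are placeholders.

```python
# Canonical 34 bp loxP site (13 bp repeat, 8 bp spacer, 13 bp repeat).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, loxp: str = LOXP) -> str:
    """Model Cre recombination between two directly repeated loxP sites:
    the floxed segment and one loxP copy are removed, one loxP remains.
    Alleles with fewer than two sites are returned unchanged."""
    first = allele.find(loxp)
    if first == -1:
        return allele
    second = allele.find(loxp, first + len(loxp))
    if second == -1:
        return allele
    return allele[: first + len(loxp)] + allele[second + len(loxp):]


# Placeholder sequences for illustration only.
targeted = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
ko = cre_excise(targeted)
print(len(targeted) - len(ko))  # bases removed: floxed segment plus one loxP
```

Inverted loxP sites would lead to inversion rather than excision, which this sketch does not model.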


Overview of the Dot Plot

Window size: 10 bp

[Figure: self-alignment dot plot of sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. Tandem repeats are found in the dot plot matrix, so it may be difficult to construct this targeting vector.
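A self-alignment dot plot of this kind can be approximated with exact word matching. The sketch below is a simplified stand-in for whatever tool produced the plot: it compares forward-strand words only (no reverse complement), requires exact matches, and the function names and toy sequence are illustrative.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where the window-length words
    starting at i and j are identical (the main diagonal is omitted)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots per diagonal offset (j - i). An offset with many dots
    corresponds to a line parallel to the main diagonal, i.e. a repeat."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy sequence: three tandem copies of a 7 bp unit, window shortened to 7.
toy = "GATTACA" * 3
print(repeat_offsets(self_dot_plot(toy, window=7)))
```

Tandem repeats appear as short diagonals at offsets that are multiples of the repeat unit, close to the main diagonal.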

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for sequence 12.]

Summary: Full length (7122 bp) | A (27.7%, 1973) | C (21.08%, 1501) | T (30.16%, 2148) | G (21.06%, 1500)

Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
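The overall GC content follows directly from the base counts in the summary, and a windowed profile can be computed in the same way. The sketch below uses the counts reported above; `gc_windows` is an illustrative helper, and the step size is an assumption because the report gives only the 300 bp window size.

```python
# Base counts from the summary line (full length 7122 bp).
counts = {"A": 1973, "C": 1501, "T": 2148, "G": 1500}
total = sum(counts.values())
gc_percent = 100 * (counts["G"] + counts["C"]) / total
print(f"{total} bp, GC = {gc_percent:.2f}%")


def gc_windows(seq: str, window: int = 300, step: int = 1) -> list:
    """GC fraction of each full window of the given size along seq."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy input with a short window, for illustration only.
print(gc_windows("GGCCAATT", window=4, step=2))
```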


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr12  +       12354796   12357795   3000
YourSeq  89     261    473   3000   88.8%     chr12  -       99172262   99172721   460
YourSeq  83     313    497   3000   80.9%     chr9   -       114372378  114647406  275029
YourSeq  82     258    508   3000   71.9%     chr9   -       83053532   83053729   198
YourSeq  81     393    517   3000   83.2%     chr12  -       75295213   75295339   127
YourSeq  79     383    508   3000   81.8%     chr9   +       73112490   73112616   127
YourSeq  75     348    508   3000   85.8%     chr17  +       29461244   29461412   169
YourSeq  75     300    498   3000   81.9%     chr11  +       29579728   29580120   393
YourSeq  74     383    512   3000   79.3%     chr10  +       69390222   69471093   80872
YourSeq  72     384    508   3000   79.2%     chr2   +       163354950  163355075  126
YourSeq  71     392    508   3000   80.3%     chr11  -       93512255   93512360   106
YourSeq  71     351    498   3000   78.4%     chr11  +       53684396   53684545   150
YourSeq  70     402    513   3000   82.2%     chr15  +       89742403   89742516   114
YourSeq  69     383    493   3000   83.7%     chr16  -       31152993   31153105   113
YourSeq  69     353    498   3000   80.2%     chr5   +       107090845  107090990  146
YourSeq  69     402    506   3000   83.9%     chr10  +       116414322  116414428  107
YourSeq  67     402    506   3000   82.0%     chr5   +       52210213   52210317   105
YourSeq  66     384    517   3000   84.6%     chr6   +       53602038   53602171   134
YourSeq  66     402    512   3000   96.0%     chr4   +       19484000   19484116   117
YourSeq  65     393    504   3000   79.5%     chr17  +       26862352   26862464   113

Note: The 3000 bp section upstream of exon 5 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr12, no significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr12  +       12358418   12361417   3000
YourSeq  84     1266   1402  3000   83.4%     chr14  +       119250333  119250468  136
YourSeq  83     1257   1395  3000   82.9%     chr16  -       30048314   30048848   535
YourSeq  79     1271   1405  3000   80.8%     chr19  -       6834645    6834773    129
YourSeq  77     1277   1405  3000   81.8%     chr12  -       102298611  102298738  128
YourSeq  77     1264   1404  3000   89.6%     chr11  -       97028843   97029175   333
YourSeq  75     1263   1402  3000   86.5%     chr1   -       135423432  135423573  142
YourSeq  74     1269   1404  3000   78.9%     chr10  -       126687410  126687541  132
YourSeq  74     1262   1403  3000   80.4%     chr19  +       53048314   53048455   142
YourSeq  73     1264   1404  3000   85.6%     chr16  -       96212508   96212647   140
YourSeq  72     1271   1394  3000   75.3%     chr1   +       22346766   22346871   106
YourSeq  70     1276   1409  3000   82.5%     chr5   +       64718138   64718283   146
YourSeq  70     1309   1402  3000   87.3%     chr14  +       30193883   30193976   94
YourSeq  69     1276   1405  3000   80.4%     chr11  -       106958908  106959036  129
YourSeq  68     1272   1407  3000   79.6%     chr10  +       70399157   70399288   132
YourSeq  66     1275   1405  3000   93.5%     chr10  -       82755127   82755258   132
YourSeq  64     1306   1405  3000   82.0%     chr13  -       108171888  108171987  100
YourSeq  63     1264   1402  3000   76.6%     chr6   +       38170531   38170668   138
YourSeq  62     1314   1405  3000   85.3%     chr11  +       54813769   54813861   93
YourSeq  62     1314   1411  3000   90.0%     chr1   +       60002666   60003102   437

Note: The 3000 bp section downstream of exon 5 is BLAT searched against the genome. Apart from the expected 100% match to its own locus on chr12, no significant similarity is found.
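The BLAT rows can be read programmatically to cross-check the design. The sketch below parses three rows copied from the two result tables and computes the interval between the upstream and downstream self-matches, which agrees with the ~622 bp effective cKO region stated in the strategy note. The `BlatHit` type and helper names are illustrative, and the field order is assumed to follow the table header.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(row: str) -> BlatHit:
    """Parse one whitespace-separated result row; field 0 is the query name."""
    f = row.split()
    return BlatHit(int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                   float(f[5].rstrip("%")), f[6], f[7],
                   int(f[8]), int(f[9]), int(f[10]))


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query covered by the aligned query interval."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


up_self = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr12 + 12354796 12357795 3000")
down_self = parse_hit("YourSeq 3000 1 3000 3000 100.0% chr12 + 12358418 12361417 3000")
up_next = parse_hit("YourSeq 89 261 473 3000 88.8% chr12 - 99172262 99172721 460")

# Bases lying strictly between the upstream and downstream 3000 bp sections.
between_bp = down_self.t_start - up_self.t_end - 1
print(between_bp)

# The next-best upstream hit covers only a small part of the query.
print(f"{query_coverage(up_next):.3f}")
```

Low query coverage combined with moderate identity is what distinguishes the secondary hits from the self-match.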


Gene and information: Fam49a family with sequence similarity 49, member A [ Mus musculus (house mouse) ] Gene ID: 76820, updated on 24-Oct-2019

Gene summary

Official Symbol: Fam49a (provided by MGI)
Official Full Name: family with sequence similarity 49, member A (provided by MGI)
Primary source: MGI:MGI:1261783
See related: Ensembl:ENSMUSG00000020589
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 9630047E15; D12Ertd553e; 2410157M17Rik
Expression: Broad expression in CNS E18 (RPKM 20.0), frontal lobe adult (RPKM 15.6) and 15 other tissues
Orthologs: human; all

Genomic context

Location: 12 A1.1; 12 5.63 cM

Exon count: 14

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  12   NC_000078.6 (12262134..12380965)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    12   NC_000078.5 (12268945..12383169)

Chromosome 12 - NC_000078.6


Transcript information: This gene has 5 transcripts

Gene: Fam49a ENSMUSG00000020589

Description: family with sequence similarity 49, member A [Source:MGI Symbol;Acc:MGI:1261783]
Gene Synonyms: 2410157M17Rik, D12Ertd553e
Location: Chromosome 12: 12,262,139-12,380,965 forward strand. GRCm38:CM001005.2
About this gene: This gene has 5 transcripts (splice variants), 201 orthologues, 1 paralogue and is a member of 1 Ensembl protein family.

Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Fam49a-202  ENSMUST00000069066.13  9452  323aa       ENSMUSP00000065613.6  Protein coding   CCDS25817  Q8BHZ0   TSL:1 GENCODE basic APPRIS P1
Fam49a-205  ENSMUST00000223061.1   1782  323aa       ENSMUSP00000152252.1  Protein coding   CCDS25817  Q8BHZ0   TSL:5 GENCODE basic APPRIS P1
Fam49a-201  ENSMUST00000069005.9   1771  323aa       ENSMUSP00000068125.8  Protein coding   CCDS25817  Q8BHZ0   TSL:1 GENCODE basic APPRIS P1
Fam49a-204  ENSMUST00000222357.1   3019  No protein  -                     Retained intron  -          -        TSL:1
Fam49a-203  ENSMUST00000221782.1   689   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl genome browser view of the Fam49a locus (138.83 kb; 12.26 Mb to 12.38 Mb on chromosome 12). Forward-strand transcripts: Fam49a-202 (protein coding), Fam49a-203 (lncRNA), Fam49a-201 (protein coding), Fam49a-205 (protein coding), Fam49a-204 (retained intron). Contigs: AC110171.29, CT010452.9. Other genes: 4921511I17Rik-201 (lncRNA). Regulatory Build legend: CTCF, enhancer, open chromatin, promoter, promoter flank, transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene; processed transcript).]


Transcript: ENSMUST00000069066

[Figure: Ensembl view of transcript Fam49a-202 (protein coding; 118.83 kb, forward strand) with protein annotation for ENSMUSP00000065...: Pfam, Protein of unknown function DUF1394; PANTHER, Protein FAM49 (PTHR12422:SF4). Sequence variants track (dbSNP and all other sources); variant legend: synonymous variant. Scale bar: 0 to 323.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
