https://www.alphaknockout.com

Mouse Slc52a3 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Slc52a3 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Slc52a3 gene (NCBI Reference Sequence: NM_027172; Ensembl: ENSMUSG00000027463) is located on mouse chromosome 2. Five exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 5 (Transcript: ENSMUST00000073228). Exon 3 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Slc52a3 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-204D14 as the template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit abnormal placental riboflavin transport and sudden neonatal death associated with hyperlipidemia and hypoglycemia due to riboflavin deficiency.

Exon 3 starts at about 40.51% of the coding region. Knockout of exon 3 will result in a frameshift of the gene. Intron 2, which will receive the 5'-loxP site, is 853 bp; intron 3, which will receive the 3'-loxP site, is 1461 bp. The effective cKO region is ~988 bp and contains no other known gene.
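The frameshift reasoning above can be sketched in a few lines of Python. This is only an illustration, not the vendor's method: the function names are hypothetical, the coding length is assumed to be 460 codons (the 460 aa protein listed for ENSMUST00000073228, stop codon excluded), and the exon lengths in the usage note are made-up values, because the report does not state the coding length of exon 3.

```python
def cds_offset_from_percent(percent: float, cds_length_bp: int) -> int:
    """Approximate coding-sequence offset (bp) for a percentage position."""
    return round(percent / 100 * cds_length_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length
    is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Assumption: 460 aa -> 460 codons, stop codon not counted.
CDS_LENGTH_BP = 460 * 3

# Approximate coding offset at which exon 3 begins (40.51% of the CDS).
exon3_offset_bp = cds_offset_from_percent(40.51, CDS_LENGTH_BP)
```

For example, `causes_frameshift(100)` is true and `causes_frameshift(99)` is false. The report's statement that removing exon 3 causes a frameshift therefore implies that the coding length of exon 3 is not divisible by 3.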


Overview of the Targeting Strategy

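[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (5' to 3', exons 1-5, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Slc52a3; homology arm; cKO region; loxP site.]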


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 aligned with itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to check for tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
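A self dot plot of this kind can be approximated by recording every pair of positions whose 10 bp words are identical. The sketch below is an assumption about how such a check could be done, not the tool used for this report; `self_dot_plot` is a hypothetical name, and only forward matches are covered (a reverse-complement pass would be needed to reproduce the second series in the plot).

```python
def self_dot_plot(seq: str, window: int = 10) -> list[tuple[int, int]]:
    """Return (i, j) pairs with i < j where the window-length words
    starting at i and j are identical (forward strand only)."""
    seq = seq.upper()
    positions: dict[str, list[int]] = {}
    # Index every window-length word by its start position.
    for i in range(len(seq) - window + 1):
        positions.setdefault(seq[i:i + window], []).append(i)
    hits = []
    # Any word seen at more than one position yields off-diagonal dots.
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy input: the same 10 bp word at positions 0 and 14.
toy_hits = self_dot_plot("ACGTACGTAC" + "TTTT" + "ACGTACGTAC")
```

An empty result means no repeated word at this window size. Runs of hits along a line parallel to the main diagonal would indicate direct repeats, and such runs close to the diagonal would indicate the tandem repeats the note refers to.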

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(7488bp) | A(22.41% 1678) | C(26.36% 1974) | T(25.88% 1938) | G(25.35% 1898)

Note: The sequence of the homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found, so constructing this targeting vector may be difficult.


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       152002283  152005282  3000
YourSeq  42     904    965   3000   93.7%     chr17  +       28917582   28917756   175
YourSeq  37     2508   2607  3000   83.0%     chr1   -       123896333  123896429  97
YourSeq  30     2648   2685  3000   88.3%     chr5   -       141607566  141607602  37
YourSeq  30     2661   2692  3000   90.4%     chr7   +       133554801  133554831  31
YourSeq  29     1      79    3000   94.0%     chr10  -       81277003   81277082   80
YourSeq  28     924    966   3000   86.7%     chr1   -       171929884  171929924  41
YourSeq  27     22     79    3000   89.7%     chr6   +       42319110   42319166   57
YourSeq  26     54     79    3000   100.0%    chr2   -       4664676    4664701    26
YourSeq  25     2648   2685  3000   82.8%     chr2   -       52530441   52530476   36
YourSeq  25     51     79    3000   93.2%     chr12  +       86760355   86760383   29
YourSeq  23     2510   2533  3000   100.0%    chr18  +       12710537   12710573   37
YourSeq  23     1443   1467  3000   96.0%     chr16  +       64440675   64440699   25
YourSeq  21     2850   2882  3000   81.9%     chr19  +       32588441   32588473   33

Note: The 3000 bp section upstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr2   +       152006271  152009270  3000
YourSeq  63     365    684   3000   82.8%     chr8   +       80083776   80084093   318
YourSeq  59     432    593   3000   79.0%     chr3   +       89805177   89805329   153
YourSeq  53     437    594   3000   88.9%     chr7   -       56285265   56285421   157
YourSeq  48     392    478   3000   91.3%     chr2   +       174917347  174944201  26855
YourSeq  45     430    481   3000   94.2%     chr17  +       6912626    6912678    53
YourSeq  44     438    591   3000   83.4%     chr2   +       73392008   73392162   155
YourSeq  43     430    481   3000   92.2%     chr9   -       98037732   98037785   54
YourSeq  43     440    516   3000   87.8%     chr5   -       96473024   96473098   75
YourSeq  42     364    668   3000   92.0%     chr3   +       125263336  125263714  379
YourSeq  38     432    481   3000   88.1%     chr4   +       19943887   19943934   48
YourSeq  36     434    483   3000   97.4%     chr2   +       10727591   10727641   51
YourSeq  35     438    481   3000   85.4%     chr2   -       13248041   13248082   42
YourSeq  34     441    480   3000   92.5%     chr3   -       95956786   95956825   40
YourSeq  34     432    480   3000   89.5%     chr6   +       119560061  119560108  48
YourSeq  33     556    595   3000   92.4%     chr4   -       100032361  100032559  199
YourSeq  31     371    419   3000   75.0%     chr6   -       100361606  100361645  40
YourSeq  31     439    482   3000   85.8%     chr15  -       33304251   33304292   42
YourSeq  29     390    419   3000   100.0%    chr12  +       116288995  116289025  31
YourSeq  28     389    419   3000   96.7%     chr8   +       24456555   24456586   32

Note: The 3000 bp section downstream of Exon 3 is BLAT searched against the genome. No significant similarity is found.
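As a consistency check, the top hit in each BLAT table is the query matching its own locus on chr2, and the gap between those two self-hits should equal the effective cKO region size quoted in the strategy summary. The short sketch below does that arithmetic; it assumes inclusive coordinates and that the two 3000 bp flanks abut the cKO region directly. The constant and function names are illustrative.

```python
# Self-hit coordinates on chr2 (+ strand), from the first row of each BLAT table.
UP_FLANK = (152_002_283, 152_005_282)    # 3000 bp upstream section
DOWN_FLANK = (152_006_271, 152_009_270)  # 3000 bp downstream section


def span(start: int, end: int) -> int:
    """Length of an interval with inclusive start and end coordinates."""
    return end - start + 1


def gap_between(up_end: int, down_start: int) -> int:
    """Number of bases strictly between two flanking intervals."""
    return down_start - up_end - 1


cko_region_bp = gap_between(UP_FLANK[1], DOWN_FLANK[0])
```

Under these assumptions the gap is 988 bp, which agrees with the ~988 bp effective cKO region, and both flanks measure exactly 3000 bp.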


Gene information:

Slc52a3 solute carrier protein family 52, member 3 [Mus musculus (house mouse)]
Gene ID: 69698, updated on 24-Sep-2019

Gene summary

Official Symbol: Slc52a3 (provided by MGI)
Official Full Name: solute carrier protein family 52, member 3 (provided by MGI)
Primary source: MGI:MGI:1916948
See related: Ensembl:ENSMUSG00000027463
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: RFT2; 2310046K01Rik
Expression: Biased expression in small intestine adult (RPKM 57.0), large intestine adult (RPKM 55.9) and 12 other tissues
Orthologs: human

Genomic context

Location: 2; 2 G3

Exon count: 6

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (151996511..152009258)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (151822247..151834994)

[Figure: genomic context of Slc52a3 on Chromosome 2 - NC_000068.7.]


Transcript information: This gene has 4 transcripts

Gene: Slc52a3 ENSMUSG00000027463

Description: solute carrier protein family 52, member 3 [Source:MGI Symbol;Acc:MGI:1916948]
Gene Synonyms: 2310046K01Rik
Location: Chromosome 2: 151,996,511-152,009,258 forward strand. GRCm38:CM000995.2
About this gene: This gene has 4 transcripts (splice variants), 202 orthologues, 1 paralogue, is a member of 1 Ensembl protein family and is associated with 21 phenotypes.

Transcripts

Name         Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Slc52a3-201  ENSMUST00000073228.11  2669  460aa    ENSMUSP00000072961.5  Protein coding  CCDS16876  Q9D6X5   TSL:1 GENCODE basic APPRIS P1
Slc52a3-204  ENSMUST00000109861.7   2590  460aa    ENSMUSP00000105487.1  Protein coding  CCDS16876  Q9D6X5   TSL:1 GENCODE basic APPRIS P1
Slc52a3-203  ENSMUST00000109859.8   2204  250aa    ENSMUSP00000105485.2  Protein coding  CCDS50749  Q9D6X5   TSL:5 GENCODE basic
Slc52a3-202  ENSMUST00000109858.1   2094  250aa    ENSMUSP00000105484.1  Protein coding  CCDS50749  Q9D6X5   TSL:1 GENCODE basic

[Figure: Ensembl genome browser view of a 32.75 kb region (151.99 Mb to 152.01 Mb, forward and reverse strands) showing the protein-coding transcripts Slc52a3-203, Slc52a3-201, Slc52a3-204 and Slc52a3-202, contig AL845161.5 and the Regulatory Build track. Regulation legend: CTCF, Enhancer, Open Chromatin, Promoter, Promoter Flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana).]


Transcript: ENSMUST00000073228

[Figure: Ensembl view of Slc52a3-201 (protein coding, 12.72 kb, forward strand) with protein tracks for transmembrane helices, low complexity (Seg), Pfam (Solute carrier family 52, riboflavin transporter) and PANTHER (Solute carrier family 52, riboflavin transporter; PTHR12929:SF4), plus sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant. Scale bar: 0 to 460.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
