https://www.alphaknockout.com

Mouse Slc22a14 Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Slc22a14 conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Slc22a14 gene (NCBI Reference Sequence: NM_001037749; Ensembl: ENSMUSG00000070280) is located on mouse chromosome 9. Twelve exons are identified, with the ATG start codon in exon 3 and the TAG stop codon in exon 12 (Transcript: ENSMUST00000093775). Exons 4~5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Slc22a14 gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-129C12 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Mice homozygous for a knock-out allele exhibit severe male infertility associated with asthenozoospermia, impaired sperm capacitation, decreased fertilization frequency, abnormal sperm flagellar bending, and abnormal sperm annulus morphology.

Exon 4 starts at about 27.24% of the coding region. Knockout of exons 4~5 will result in a frameshift of the gene. The size of intron 3 for 5'-loxP site insertion is 627 bp, and the size of intron 5 for 3'-loxP site insertion is 859 bp. The size of the effective cKO region is ~959 bp. The cKO region does not contain any other known gene.
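As a rough check of the figures above, the position of exon 4 within the coding sequence can be estimated from the 629 aa protein length reported for ENSMUST00000093775. This is a minimal sketch: it assumes the percentage is measured against the coding sequence including the stop codon, which the report does not state, and `causes_frameshift` is an illustrative helper (the coding lengths of exons 4 and 5 are not given here).

```python
# Values taken from the report; the stop-codon convention is an assumption.
PROTEIN_LENGTH_AA = 629        # Slc22a14-201 protein length
EXON4_START_FRACTION = 0.2724  # "about 27.24% of the coding region"


def cds_length_bp(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in bp for a protein of the given length."""
    codons = protein_aa + (1 if include_stop else 0)
    return codons * 3


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


cds_bp = cds_length_bp(PROTEIN_LENGTH_AA)
exon4_offset_bp = round(EXON4_START_FRACTION * cds_bp)
print(f"CDS length: {cds_bp} bp; exon 4 begins near coding position {exon4_offset_bp}")
```

Under that assumption exon 4 begins roughly 515 bp into the coding sequence, so about the first quarter of the protein is encoded upstream of the cKO region.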


Overview of the Targeting Strategy

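(Figure: schematic of the targeting strategy with four panels: wildtype allele (5' to 3', with the two gRNA regions marked and exons 1, 3, 4, 5, 6 and 12 labeled), targeting vector, targeted allele, and constitutive KO allele after Cre recombination. Legend: exon of mouse Slc22a14, homology arm, cKO region, loxP site.)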
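The step from the targeted allele to the constitutive KO allele can be illustrated on a toy sequence. The sketch below is not part of the project design: it assumes two loxP sites in the same orientation, uses the commonly cited 34 bp wild-type loxP sequence (the report does not list it), and the function name `cre_excise` and the flanking bases are purely illustrative.

```python
# Commonly cited wild-type loxP site: 13 bp arm + 8 bp spacer + 13 bp arm.
LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"


def cre_excise(allele: str) -> str:
    """Model Cre recombination between the first two same-orientation loxP sites.

    The floxed segment and one loxP copy are removed; a single loxP remains.
    Alleles with fewer than two loxP sites are returned unchanged.
    """
    first = allele.find(LOXP)
    if first == -1:
        return allele
    second = allele.find(LOXP, first + len(LOXP))
    if second == -1:
        return allele
    return allele[:first] + allele[second:]


# Toy targeted allele: 5' flank, loxP, "cKO region", loxP, 3' flank.
targeted = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
ko_allele = cre_excise(targeted)
print(len(targeted), len(ko_allele))
```

In the real allele the excised segment would be the ~959 bp cKO region carrying exons 4~5, leaving one loxP site between intron 3 and intron 5 sequence.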


Overview of the Dot Plot

Window size: 10 bp

(Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.)

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
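The self-alignment described in the note can be sketched as an exact-match word comparison. This is an illustrative approximation, not the tool used to produce the plot: it reports forward-strand matches only, and the function name `dot_plot_matches` is hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple


def dot_plot_matches(seq: str, window: int = 10) -> List[Tuple[int, int]]:
    """Return off-diagonal (i, j) pairs, i < j, whose window-length words are identical.

    Each pair corresponds to one dot away from the main diagonal of a
    self-versus-self dot plot; runs of such dots indicate repeats.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)

    hits = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                hits.append((starts[a], starts[b]))
    return sorted(hits)


# Toy example with a 4 bp window: the word ACGT occurs at positions 0 and 6.
print(dot_plot_matches("ACGTTTACGT", window=4))
```

A sequence free of tandem repeats yields few or no off-diagonal pairs at a 10 bp window, which is the condition the note relies on for PCR screening.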

Overview of the GC Content Distribution

Window size: 300 bp

(Figure: GC content plotted along the sequence.)

Summary: Full Length(7459bp) | A(24.13% 1800) | C(24.76% 1847) | T(26.22% 1956) | G(24.88% 1856)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
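The base counts in the summary line can be cross-checked, and a sliding-window GC profile computed, with a few lines of Python. The counts below are copied from the summary; the `sliding_gc` helper is a minimal sketch of a 300 bp window scan and is not necessarily how the plot was generated.

```python
from typing import List

# Base counts as reported in the summary line.
BASE_COUNTS = {"A": 1800, "C": 1847, "T": 1956, "G": 1856}

total_bp = sum(BASE_COUNTS.values())
overall_gc_percent = 100 * (BASE_COUNTS["G"] + BASE_COUNTS["C"]) / total_bp


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC fraction of each full window along the sequence."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


print(f"Total: {total_bp} bp, overall GC: {overall_gc_percent:.2f}%")
```

The counts sum to the stated full length of 7459 bp, and the overall GC content comes out just under 50%.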


BLAT Search Results (up)

QUERY     SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000   1      3000  3000   100.0%    chr9   -       119180138  119183137  3000
YourSeq   159    914    1483  3000   86.1%     chr8   +       70625906   70626325   420
YourSeq   144    921    1088  3000   92.2%     chr12  -       77150656   77150822   167
YourSeq   144    916    1083  3000   91.0%     chr1   -       190963661  190963826  166
YourSeq   143    802    1083  3000   93.9%     chr4   -       134694535  134695019  485
YourSeq   142    918    1084  3000   91.5%     chr4   -       48794236   48794389   154
YourSeq   140    921    1074  3000   96.2%     chr2   -       69802560   69802728   169
YourSeq   138    918    1085  3000   91.8%     chr9   -       42529570   42529735   166
YourSeq   138    914    1085  3000   90.8%     chr13  -       105191691  105191860  170
YourSeq   137    889    1086  3000   85.9%     chr6   -       79593951   79594116   166
YourSeq   137    921    1082  3000   91.1%     chr14  -       59580876   59581034   159
YourSeq   137    921    1084  3000   94.8%     chrX   +       74441554   74441717   164
YourSeq   136    909    1073  3000   92.1%     chr15  -       91450376   91450541   166
YourSeq   136    901    1082  3000   96.6%     chr13  -       26619361   26619854   494
YourSeq   136    921    1081  3000   94.8%     chr5   +       93395540   93395700   161
YourSeq   135    917    1083  3000   89.5%     chr8   -       123099695  123099848  154
YourSeq   135    921    1083  3000   89.2%     chr1   -       156716450  156716605  156
YourSeq   134    914    1071  3000   90.1%     chr2   -       162007445  162007595  151
YourSeq   134    921    1083  3000   91.1%     chr13  -       55605036   55605196   161
YourSeq   134    922    1074  3000   94.8%     chr11  +       107664060  107664213  154

Note: The 3000 bp section upstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY     SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
-----------------------------------------------------------------------------------------
YourSeq   3000   1      3000  3000   100.0%    chr9   -       119176179  119179178  3000
YourSeq   62     2017   2133  3000   84.3%     chr2   +       7172534    7172638    105
YourSeq   37     1839   1883  3000   91.2%     chr15  -       78371035   78371079   45
YourSeq   36     1841   1882  3000   95.0%     chr1   +       112761935  112761977  43
YourSeq   35     1840   1882  3000   90.7%     chr5   +       147212848  147212890  43
YourSeq   35     2017   2053  3000   97.3%     chr10  +       114984953  114984989  37
YourSeq   34     1841   1882  3000   90.5%     chr11  -       3148431    3148472    42
YourSeq   34     1840   1877  3000   94.8%     chr2   +       31526251   31526288   38
YourSeq   32     1841   1878  3000   92.2%     chr8   -       85559944   85559981   38
YourSeq   31     1840   1882  3000   86.1%     chr11  +       29390379   29390421   43
YourSeq   29     1839   1877  3000   87.2%     chr10  -       128288426  128288464  39
YourSeq   28     1844   1875  3000   93.8%     chr7   +       92613445   92613476   32
YourSeq   27     1840   1872  3000   91.0%     chr18  -       42695936   42695968   33
YourSeq   20     1969   2002  3000   79.5%     chr15  +       78280095   78280128   34

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
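The two BLAT tables can be summarized programmatically. The sketch below uses only a few rows copied from the results above and makes two simple checks: how strong the best non-self hit is relative to the 3000 bp query, and how large the genomic gap between the two flanking windows is. The report does not define a numeric threshold for "significant" similarity, so none is applied; the type and function names are illustrative.

```python
from typing import List, NamedTuple, Tuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent
    chrom: str
    strand: str
    t_start: int
    t_end: int


# First rows of each table, copied from the report (not the full tables).
UP_HITS = [
    BlatHit(3000, 1, 3000, 3000, 100.0, "chr9", "-", 119180138, 119183137),
    BlatHit(159, 914, 1483, 3000, 86.1, "chr8", "+", 70625906, 70626325),
    BlatHit(144, 921, 1088, 3000, 92.2, "chr12", "-", 77150656, 77150822),
]
DOWN_HITS = [
    BlatHit(3000, 1, 3000, 3000, 100.0, "chr9", "-", 119176179, 119179178),
    BlatHit(62, 2017, 2133, 3000, 84.3, "chr2", "+", 7172534, 7172638),
    BlatHit(37, 1839, 1883, 3000, 91.2, "chr15", "-", 78371035, 78371079),
]


def split_self_hit(hits: List[BlatHit]) -> Tuple[BlatHit, List[BlatHit]]:
    """Separate the full-length 100% self match from all other hits."""
    self_hits = [h for h in hits if h.score == h.q_size and h.identity == 100.0]
    others = [h for h in hits if h not in self_hits]
    return self_hits[0], others


def best_secondary_fraction(hits: List[BlatHit]) -> float:
    """Best non-self score as a fraction of the query size."""
    self_hit, others = split_self_hit(hits)
    return max(h.score for h in others) / self_hit.q_size


up_self, _ = split_self_hit(UP_HITS)
down_self, _ = split_self_hit(DOWN_HITS)

# The gene is on the reverse strand, so the upstream window has the higher coordinates.
gap_bp = up_self.t_start - down_self.t_end - 1

print(f"Best secondary hit (up): {best_secondary_fraction(UP_HITS):.1%}")
print(f"Best secondary hit (down): {best_secondary_fraction(DOWN_HITS):.1%}")
print(f"Gap between flanking windows: {gap_bp} bp")
```

The gap between the two self matches is 959 bp, which agrees with the stated size of the effective cKO region, and the best secondary hits cover only a small fraction of each 3000 bp query.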


Gene and information: Slc22a14 solute carrier family 22 (organic cation transporter), member 14 [ Mus musculus (house mouse) ] Gene ID: 382113, updated on 3-Sep-2019

Gene summary

Official Symbol: Slc22a14 (provided by MGI)
Official Full Name: solute carrier family 22 (organic cation transporter), member 14 (provided by MGI)
Primary source: MGI:MGI:2685974
See related: Ensembl:ENSMUSG00000070280
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Gm1128
Expression: Restricted expression toward testis adult (RPKM 114.3)

Genomic context

Location: 9; 9 F3

Exon count: 13

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  9    NC_000075.6 (119169453..119190430, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    9    NC_000075.5 (119078574..119099511, complement)

(Figure: genomic context on chromosome 9, NC_000075.6.)


Transcript information: This gene has 4 transcripts

Gene: Slc22a14 ENSMUSG00000070280

Description: solute carrier family 22 (organic cation transporter), member 14 [Source:MGI Symbol;Acc:MGI:2685974]
Gene Synonyms: LOC382113
Location: Chromosome 9: 119,169,455-119,365,553 reverse strand. GRCm38:CM001002.2
About this gene: This gene has 4 transcripts (splice variants), 445 orthologues, 26 paralogues and is a member of 1 Ensembl protein family.

Transcripts

Name          Transcript ID          bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Slc22a14-201  ENSMUST00000093775.11  2213  629aa    ENSMUSP00000091289.5  Protein coding  CCDS23609  Q497L9   TSL:1 GENCODE basic APPRIS P1
Slc22a14-204  ENSMUST00000170400.8   2110  629aa    ENSMUSP00000131982.2  Protein coding  CCDS23609  Q497L9   TSL:5 GENCODE basic APPRIS P1
Slc22a14-203  ENSMUST00000152061.1   589   196aa    ENSMUSP00000117967.1  Protein coding  -          F7AMC9   CDS 5' and 3' incomplete TSL:3
Slc22a14-202  ENSMUST00000127794.1   340   20aa     ENSMUSP00000120144.1  Protein coding  -          D3YUH1   CDS 3' incomplete TSL:5


(Figure: Ensembl region view of chromosome 9, 119.20 Mb to 119.35 Mb, 216.10 kb. Forward-strand genes: Gm47289, Gm10608, Acaa1a, Xylb. Reverse-strand genes: Slc22a14, Slc22a13b, Slc22a13, Gm22729, Oxsr1, Myd88. Contigs: AC055818.9, AC128702.4. Tracks: genes (comprehensive set) and Regulatory Build, with legends for regulatory features and gene biotypes.)


Transcript: ENSMUST00000093775

(Figure: protein domain and variant view for Slc22a14-201, protein coding, reverse strand, 20.95 kb; protein ENSMUSP00000091..., scale 0 to 629 aa. Annotation tracks: transmembrane helices; MobiDB lite; low complexity (Seg); Superfamily: MFS transporter superfamily; Pfam: Major facilitator, sugar transporter-like; PROSITE profiles: Major facilitator superfamily domain; PANTHER: PTHR24064, PTHR24064:SF48; Gene3D: 1.20.1250.20; CDD: cd17374; sequence variants from dbSNP and all other sources, shown as missense and synonymous variants.)

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
