https://www.alphaknockout.com

Mouse Khsrp Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Khsrp conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Khsrp gene (NCBI Reference Sequence: NM_010613; Ensembl: ENSMUSG00000007670) is located on mouse chromosome 17. Eighteen exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 18 (Transcript: ENSMUST00000007814). Exons 2~4 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Khsrp gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-87A6 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit decreased susceptibility to HSV-1 infection.

Exon 2 starts at about 11.27% of the coding region. Knockout of exons 2~4 will result in a frameshift of the gene. Intron 1, which will receive the 5'-loxP site, is 2009 bp; intron 4, which will receive the 3'-loxP site, is 718 bp. The effective cKO region is ~1822 bp and does not contain any other known gene.
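The two claims above (where exon 2 begins in the coding sequence, and why removing exons 2~4 shifts the reading frame) can be illustrated with a short sketch. It assumes the 748 aa protein length listed for ENSMUST00000007814 in the transcript table and counts the stop codon as part of the coding region; the report does not say how the 11.27% figure was computed, and it does not list exon lengths, so the length passed to causes_frameshift below is a hypothetical value for illustration only.

```python
def cds_length_nt(protein_aa: int, include_stop: bool = True) -> int:
    """Coding-sequence length in nucleotides for a protein of the given length."""
    return protein_aa * 3 + (3 if include_stop else 0)


def approx_cds_position(fraction: float, cds_nt: int) -> int:
    """Approximate nucleotide position reached after `fraction` of the CDS."""
    return round(fraction * cds_nt)


def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


cds_nt = cds_length_nt(748)                        # 748 aa from the transcript table
exon2_start = approx_cds_position(0.1127, cds_nt)  # 11.27% from the report

# HYPOTHETICAL combined coding length of the deleted exons (not from the report).
example_deleted_nt = 316
print(cds_nt, exon2_start, causes_frameshift(example_deleted_nt))
```

Under these assumptions exon 2 begins roughly 250 nt into the coding sequence, so a frameshift introduced there would affect most of the protein.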


Overview of the Targeting Strategy

[Figure: schematic, drawn 5' to 3', of the wildtype allele (exons 1 to 8 and 18 shown, with two gRNA regions marked), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Khsrp; homology arm; cKO region; loxP site.]
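The step from the targeted allele to the constitutive KO allele can be sketched as a string operation: Cre recombination between two loxP sites in the same orientation removes the intervening sequence and leaves one loxP site behind. The sketch below is a toy model, not the project's design tooling. The LOXP constant is the commonly cited 34 bp wild-type loxP sequence; the report does not state which loxP variant is used, so treat it as an assumption. The function name cre_excise and the toy allele are hypothetical.

```python
# Commonly cited wild-type loxP site (13 bp repeat + 8 bp spacer + 13 bp repeat).
# ASSUMPTION: the report does not give the loxP sequence used in the vector.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre excision between the first two same-orientation loxP sites.

    Returns the allele unchanged if fewer than two sites are present.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume
    # after the second site: the floxed segment and one site are removed.
    return allele[: first + len(site)] + allele[second + len(site):]


# Toy targeted allele: 5' flank, loxP, floxed segment, loxP, 3' flank.
toy_targeted = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
toy_ko = cre_excise(toy_targeted)
print(len(toy_targeted), len(toy_ko))
```

Only same-orientation sites are modeled here; inverted sites, which cause inversion rather than excision, are out of scope for this sketch.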


Overview of the Dot Plot

Window size: 10 bp

[Figure: self dot plot of Sequence 12, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. It may be difficult to construct this targeting vector.
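As a rough illustration of the self-alignment described in the note, the sketch below builds forward-strand dot-plot coordinates by matching identical words of a fixed window length, then counts matches per diagonal offset. Many matches at a small offset suggest a tandem repeat. This is a minimal exact-match version and is not the tool used to produce the plot; the function names and the toy sequence are hypothetical, and reverse-complement matches are not computed.

```python
from collections import defaultdict


def self_dot_plot(seq: str, window: int = 10):
    """Return sorted (i, j) pairs, i < j, where the words of length `window`
    starting at i and j are identical (forward strand only)."""
    positions = defaultdict(list)
    for i in range(len(seq) - window + 1):
        positions[seq[i:i + window]].append(i)
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


def repeat_offsets(dots):
    """Count dots on each off-diagonal, keyed by offset j - i."""
    counts = defaultdict(int)
    for i, j in dots:
        counts[j - i] += 1
    return dict(counts)


# Toy sequence (hypothetical): a CAG repeat flanked by unique sequence.
toy = "ACGTTGCA" + "CAG" * 6 + "TTGACCGT"
offsets = repeat_offsets(self_dot_plot(toy, window=6))
print(sorted(offsets.items()))
```

The report's plot uses a 10 bp window; the toy call uses 6 only because the example sequence is short.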

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content profile of Sequence 12.]

Summary: Full Length(8322bp) | A(21.79% 1813) | C(24.96% 2077) | T(24.36% 2027) | G(28.9% 2405)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. Regions of significantly high GC content are found. It may be difficult to construct this targeting vector.
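The summary figures can be checked from the reported base counts, and the windowed profile can be sketched in a few lines. The counts below are taken from the summary line; the sliding_gc function is an assumed, simplified stand-in for whatever tool produced the plot, and the short test sequence is hypothetical.

```python
def base_composition(counts):
    """Return (total length, percentage per base rounded to 2 decimals)."""
    total = sum(counts.values())
    return total, {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC percentage from per-base counts."""
    total = sum(counts.values())
    return 100 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1):
    """GC percentage of each full window of length `window` along `seq`."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


# Base counts as reported in the summary line.
REPORTED_COUNTS = {"A": 1813, "C": 2077, "T": 2027, "G": 2405}

total, percentages = base_composition(REPORTED_COUNTS)
print(total, percentages, round(gc_percent(REPORTED_COUNTS), 2))
```

The counts sum to the stated full length and reproduce the stated percentages, giving an overall GC content of about 53.9%.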


BLAT Search Results (up)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  -       57029389   57032388   3000
YourSeq  46     1041   1114  3000   80.0%     chr11  -       23306771   23306841   71
YourSeq  32     859    945   3000   77.2%     chr3   -       134240125  134240201  77
YourSeq  28     928    967   3000   76.7%     chr2   +       180710235  180710267  33
YourSeq  28     928    966   3000   76.7%     chr11  +       20741520   20741551   32
YourSeq  22     16     37    3000   100.0%    chr2   -       29226141   29226162   22
YourSeq  22     1855   1876  3000   100.0%    chr12  -       74063680   74063701   22
YourSeq  22     20     41    3000   100.0%    chr11  -       63898725   63898746   22
YourSeq  21     928    948   3000   100.0%    chr2   -       32151049   32151069   21
YourSeq  21     328    348   3000   100.0%    chr11  -       57812206   57812226   21
YourSeq  21     34     56    3000   95.7%     chr1   -       77553744   77553766   23
YourSeq  21     1261   1285  3000   92.0%     chr3   +       88085591   88085615   25
YourSeq  20     2197   2216  3000   100.0%    chr1   +       84801992   84802011   20
YourSeq  20     928    947   3000   100.0%    chr1   +       9700232    9700251    20

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.
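For readers who want to inspect the BLAT rows programmatically, the sketch below parses whitespace-separated result rows and reports the highest-scoring hit other than the query's own locus. It assumes the eleven-column layout shown in the header (QUERY through SPAN), optionally preceded by the "browser details" link text found in the raw page. The class and function names are hypothetical, only the first three upstream rows are included, and the sketch makes no judgment about what counts as significant similarity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlatHit:
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_blat_row(row: str) -> BlatHit:
    """Parse one whitespace-separated BLAT result row."""
    fields = row.split()
    if fields[:2] == ["browser", "details"]:  # link text from the web page
        fields = fields[2:]
    _query, score, qs, qe, qsize, ident, chrom, strand, ts, te, span = fields
    return BlatHit(int(score), int(qs), int(qe), int(qsize),
                   float(ident.rstrip("%")), chrom, strand,
                   int(ts), int(te), int(span))


def is_self_hit(hit: BlatHit) -> bool:
    """True when the hit covers the whole query at 100% identity."""
    return hit.q_start == 1 and hit.q_end == hit.q_size and hit.identity == 100.0


def best_other_hit(hits: List[BlatHit]) -> Optional[BlatHit]:
    """Highest-scoring hit that is not the query's own locus."""
    others = [h for h in hits if not is_self_hit(h)]
    return max(others, key=lambda h: h.score) if others else None


# First three rows of the upstream search, as listed in the report.
ROWS = [
    "browser details YourSeq 3000 1 3000 3000 100.0% chr17 - 57029389 57032388 3000",
    "browser details YourSeq 46 1041 1114 3000 80.0% chr11 - 23306771 23306841 71",
    "browser details YourSeq 32 859 945 3000 77.2% chr3 - 134240125 134240201 77",
]
hits = [parse_blat_row(r) for r in ROWS]
print(best_other_hit(hits))
```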

BLAT Search Results (down)

QUERY    SCORE  START  END   QSIZE  IDENTITY  CHROM  STRAND  START      END        SPAN
---------------------------------------------------------------------------------------
YourSeq  3000   1      3000  3000   100.0%    chr17  -       57024567   57027566   3000
YourSeq  551    467    2778  3000   89.9%     chr4   -       143459657  143460313  657
YourSeq  143    1205   1363  3000   95.6%     chr11  -       87464926   87492164   27239
YourSeq  143    1203   1363  3000   96.2%     chr11  +       87275215   87275381   167
YourSeq  141    1205   1363  3000   96.8%     chr9   +       60300399   60300560   162
YourSeq  140    1203   1363  3000   93.8%     chr15  -       96897120   96897296   177
YourSeq  139    1204   1363  3000   92.0%     chr7   -       107431655  107431804  150
YourSeq  138    1203   1363  3000   91.6%     chr6   -       124940428  124940584  157
YourSeq  138    1203   1363  3000   91.3%     chr11  +       20075221   20075370   150
YourSeq  138    1197   1363  3000   91.6%     chr1   +       11309262   11309429   168
YourSeq  137    1199   1363  3000   90.2%     chr8   -       117062827  117062980  154
YourSeq  137    1203   1362  3000   92.3%     chr12  -       76692386   76692544   159
YourSeq  137    1185   1513  3000   84.1%     chr7   +       27177248   27177487   240
YourSeq  136    1203   1363  3000   93.1%     chr1   -       12803801   12803969   169
YourSeq  136    1204   1364  3000   93.2%     chr2   +       59727308   59727484   177
YourSeq  135    1210   1362  3000   94.8%     chr8   +       69857384   69857553   170
YourSeq  135    1205   1363  3000   89.7%     chr2   +       56166796   56166949   154
YourSeq  134    1204   1364  3000   93.0%     chr12  -       105796914  105797081  168
YourSeq  134    1205   1363  3000   89.5%     chr11  -       116504004  116504154  151
YourSeq  134    1203   1360  3000   93.0%     chr1   -       9901671    9901835    165

Note: The 3000 bp section downstream of Exon 4 is BLAT searched against the genome. No significant similarity is found.
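The chr17 coordinates of the two self-hits allow a quick consistency check on the size of the region they flank. The sketch below assumes that the upstream and downstream 3000 bp query sections directly abut the cKO region on either side; the report does not state this explicitly, so the result is a cross-check under that assumption rather than a definition. The function and constant names are hypothetical.

```python
def interval_length(interval):
    """Length of a 1-based inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def gap_between(interval_a, interval_b):
    """Number of bases strictly between two non-overlapping inclusive intervals."""
    (_, first_end), (second_start, _) = sorted([interval_a, interval_b])
    return second_start - first_end - 1


# chr17 coordinates of the 100% self-hits in the two BLAT tables.
UPSTREAM_SECTION = (57029389, 57032388)    # upstream of exon 2 (gene is on the minus strand)
DOWNSTREAM_SECTION = (57024567, 57027566)  # downstream of exon 4

print(interval_length(UPSTREAM_SECTION),
      interval_length(DOWNSTREAM_SECTION),
      gap_between(UPSTREAM_SECTION, DOWNSTREAM_SECTION))
```

Under that assumption the gap is 1822 bp, which agrees with the ~1822 bp effective cKO region stated earlier in the report.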


Gene and information:

Khsrp KH-type splicing regulatory protein [ Mus musculus (house mouse) ]
Gene ID: 16549, updated on 10-Sep-2019

Gene summary

Official Symbol: Khsrp (provided by MGI)
Official Full Name: KH-type splicing regulatory protein (provided by MGI)
Primary source: MGI:MGI:1336214
See related: Ensembl:ENSMUSG00000007670
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: Fbp2; Ksrp; Fubp2; 6330409F21Rik
Expression: Ubiquitous expression in adrenal adult (RPKM 54.0), thymus adult (RPKM 53.0) and 28 other tissues
Orthologs: human

Genomic context

Location: 17 D; 17 29.63 cM

Exon count: 18

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  17   NC_000083.6 (57021049..57031507, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    17   NC_000083.5 (57160472..57170930, complement)

[Figure: genomic context of Khsrp on chromosome 17 (NC_000083.6).]


Transcript information: This gene has 5 transcripts

Gene: Khsrp ENSMUSG00000007670

Description: KH-type splicing regulatory protein [Source:MGI Symbol;Acc:MGI:1336214]
Gene Synonyms: 6330409F21Rik, KSRP
Location: Chromosome 17: 57,021,051-57,031,522 reverse strand. GRCm38:CM001010.2
About this gene: This gene has 5 transcripts (splice variants), 181 orthologues, 12 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.

Transcripts

Name       Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt     Flags
Khsrp-201  ENSMUST00000007814.9  3978  748aa       ENSMUSP00000007814.8  Protein coding   CCDS50157  Q3U0V1      TSL:1, GENCODE basic, APPRIS P2
Khsrp-205  ENSMUST00000233480.1  2698  721aa       ENSMUSP00000156463.1  Protein coding   -          A0A3B2WCD8  GENCODE basic, APPRIS ALT2
Khsrp-202  ENSMUST00000232759.1  806   238aa       ENSMUSP00000156771.1  Protein coding   -          A0A3B2W465  CDS 5' incomplete
Khsrp-204  ENSMUST00000233246.1  757   No protein  -                     Retained intron  -          -           -
Khsrp-203  ENSMUST00000232765.1  593   No protein  -                     Retained intron  -          -           -

[Figure: Ensembl genome browser view of a 30.47 kb region of chromosome 17 (57.02 Mb to 57.04 Mb). Forward strand: Gm17949-201 (protein coding). Contig: CT571247.13. Reverse strand: Gtf2f1-201, Khsrp-201, Khsrp-205, Khsrp-202, Slc25a41-201 and Slc25a41-202 (protein coding); Khsrp-204 and Khsrp-203 (retained intron). A Regulatory Build track is shown; its legend lists CTCF, open chromatin, promoter, promoter flank, and transcription factor binding site. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana) and non-protein coding (processed transcript).]


Transcript: ENSMUST00000007814

[Figure: protein domain view of Khsrp-201 (protein coding; reverse strand, 10.44 kb), with a scale bar from 0 to 748 aa. Annotation tracks: MobiDB lite; Low complexity (Seg); Superfamily: K Homology domain, type 1 superfamily; SMART: K Homology domain; Pfam: K Homology domain, type 1, and Far upstream element-binding protein, C-terminal; PROSITE profiles: PS50084; PANTHER: PTHR10288, PTHR10288:SF101; Gene3D: K Homology domain, type 1 superfamily; CDD: cd00105. Sequence variants (dbSNP and all other sources) are shown as missense and synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
