https://www.alphaknockout.com

Mouse Srpx Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Srpx conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Srpx gene (NCBI Reference Sequence: NM_016911; Ensembl: ENSMUSG00000090084) is located on mouse chromosome X. Ten exons have been identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 10 (Transcript: ENSMUST00000044789). Exon 5 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Srpx gene. To engineer the targeting vector, homologous arms and the cKO region will be generated by PCR using BAC clone RP23-388P19 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice either heterozygous or homozygous for a knock-out allele display increased sensitivity to malignant tumor formation at 7-12 months of age. In addition, homozygotes exhibit atypical lymphocyte morphology and splenomegaly.

Exon 5 starts at about 37.86% of the way into the coding region. Knockout of exon 5 will result in a frameshift of the gene. Intron 4, used for 5'-loxP site insertion, is 8345 bp; intron 5, used for 3'-loxP site insertion, is 3152 bp. The effective cKO region is ~627 bp. The cKO region does not contain any other known gene.
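The frameshift reasoning can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the coding length of 1392 bp is an assumption derived from the 464 aa protein listed for ENSMUST00000044789 (stop codon excluded), and the exon lengths passed to `causes_frameshift` are placeholders, because the report does not state the length of exon 5.

```python
def cds_offset(fraction: float, cds_length_bp: int) -> int:
    """Approximate coding position (in bp) at a given fraction of the CDS."""
    return round(fraction * cds_length_bp)


def causes_frameshift(deleted_coding_bp: int) -> bool:
    """Deleting coding sequence shifts the reading frame when the
    deleted length is not a multiple of 3."""
    return deleted_coding_bp % 3 != 0


# Assumption: 464 aa * 3 = 1392 coding bp (stop codon not counted).
CDS_LENGTH_BP = 464 * 3

# Exon 5 is reported to start about 37.86% of the way into the coding region.
approx_start = cds_offset(0.3786, CDS_LENGTH_BP)
print(f"Exon 5 starts near coding position {approx_start} of {CDS_LENGTH_BP}")

# Placeholder exon lengths, NOT taken from the report.
for hypothetical_len in (99, 100):
    print(hypothetical_len, "bp deleted -> frameshift:",
          causes_frameshift(hypothetical_len))
```

With the actual exon 5 coding length substituted for the placeholder, the same check would confirm whether the deletion is out of frame, as the report states.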


Overview of the Targeting Strategy

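[Figure: schematic of the targeting strategy showing, from top to bottom, the wildtype allele (exons 1, 5 and 10 marked, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele after Cre recombination. Legend: exon of mouse Srpx, homology arm, cKO region, loxP site.]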


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of Sequence 12 against itself, showing forward and reverse-complement matches.]

Note: The sequence of homologous arms and cKO region is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix. So this region is suitable for PCR screening or sequencing analysis.
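A self-alignment dot plot of this kind can be approximated by exact word matching. The sketch below is an assumption about how such a plot may be computed, not the tool used for this report: it records every pair of positions where a 10 bp window recurs on the forward strand or as a reverse complement. The function names and the toy sequence are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def dot_plot_matches(seq: str, window: int = 10):
    """Return (forward, reverse) lists of (i, j) window start positions.

    forward: window at i is identical to the window at j (j > i).
    reverse: the reverse complement of the window at i occurs at j (j >= i).
    """
    seq = seq.upper()
    index = defaultdict(list)
    n_windows = len(seq) - window + 1
    for i in range(n_windows):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(n_windows):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index.get(word, []) if j > i)
        rc = reverse_complement(word)
        reverse.extend((i, j) for j in index.get(rc, []) if j >= i)
    return forward, reverse


# Toy sequence with one 10 bp direct repeat separated by a spacer.
toy = "ACGTTGCAAG" + "TTTTT" + "ACGTTGCAAG"
fwd, rev = dot_plot_matches(toy, window=10)
print("forward repeats:", fwd)
```

Long runs of consecutive off-diagonal matches would indicate tandem or inverted repeats; the note above reports that no significant repeats of this kind were found in the homologous arms and cKO region.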

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution plot for Sequence 12.]

Summary: Full Length(7127bp) | A(28.82% 2054) | C(20.84% 1485) | T(30.6% 2181) | G(19.74% 1407)

Note: The sequence of homologous arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.
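The base counts in the summary can be cross-checked, and a windowed GC profile computed, with a short Python sketch. The counts are taken from the summary line above; the function names and the toy sequence are hypothetical, and the sliding-window routine is only an assumption about how a 300 bp window profile might be produced.

```python
# Base counts as reported in the summary line.
counts = {"A": 2054, "C": 1485, "T": 2181, "G": 1407}


def gc_percent_from_counts(base_counts: dict) -> float:
    """Overall GC content (%) from per-base counts."""
    total = sum(base_counts.values())
    return 100 * (base_counts["G"] + base_counts["C"]) / total


def windowed_gc(seq: str, window: int = 300, step: int = 1) -> list:
    """GC content (%) of each full-length window along the sequence."""
    seq = seq.upper()
    return [
        100 * sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


total_bp = sum(counts.values())
overall_gc = gc_percent_from_counts(counts)
print(f"total length: {total_bp} bp, overall GC: {overall_gc:.2f}%")

# Toy example with a small window to show the shape of the output.
print(windowed_gc("GGCCAATT", window=4, step=2))
```

The counts sum to the stated full length of 7127 bp, and the overall GC content works out to roughly 40.6%, which is consistent with the note that no high-GC region is present.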


BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chrX   -       10059330   10062329   3000
YourSeq  38     1142    1194  3000   75.7%     chr8   +       68105091   68105131   41
YourSeq  37     1150    1194  3000   83.4%     chr4   -       84857970   84858011   42
YourSeq  35     1158    1194  3000   97.3%     chr3   +       81471726   81471762   37
YourSeq  35     1158    1194  3000   97.3%     chr14  +       8546439    8546475    37
YourSeq  34     1158    1194  3000   97.3%     chr14  -       122051519  122051624  106
YourSeq  34     1152    1196  3000   77.0%     chr1   -       43138846   43138884   39
YourSeq  33     1159    1195  3000   88.9%     chr13  -       101654288  101654323  36
YourSeq  33     1158    1194  3000   97.2%     chr11  -       115028573  115028621  49
YourSeq  33     1158    1194  3000   97.2%     chr11  -       115039330  115039372  43
YourSeq  33     1159    1194  3000   97.2%     chr10  -       85542686   85542724   39
YourSeq  33     1158    1194  3000   94.6%     chr8   +       49034716   49034752   37
YourSeq  33     1159    1194  3000   97.3%     chr11  +       4013489    4013596    108
YourSeq  32     1158    1193  3000   94.5%     chr2   -       54619126   54619161   36
YourSeq  32     1158    1194  3000   97.1%     chr13  -       53072464   53072500   37
YourSeq  32     1158    1193  3000   94.5%     chr18  +       54220208   54220243   36
YourSeq  32     1158    1194  3000   97.1%     chr13  +       99756996   99757032   37
YourSeq  32     1158    1193  3000   94.5%     chr10  +       15557310   15557345   36
YourSeq  31     1158    1194  3000   94.3%     chr2   -       113819399  113819450  52
YourSeq  31     1158    1194  3000   82.4%     chr13  -       6398669    6398702    34

Note: The 3000 bp section upstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
----------------------------------------------------------------------------------------
YourSeq  3000   1       3000  3000   100.0%    chrX   -       10055703   10058702   3000
YourSeq  133    393     543   3000   95.9%     chr14  -       79328803   79328956   154
YourSeq  132    394     542   3000   96.5%     chr7   -       132553584  132553736  153
YourSeq  132    391     547   3000   94.1%     chr17  +       29544858   29545015   158
YourSeq  131    394     543   3000   95.2%     chr8   -       47496668   47496820   153
YourSeq  131    394     547   3000   94.0%     chr15  -       82297803   82297960   158
YourSeq  129    225     533   3000   83.7%     chr16  -       13341743   13341995   253
YourSeq  129    394     548   3000   92.2%     chr15  -       79893394   79893545   152
YourSeq  128    397     546   3000   93.3%     chr1   -       131450198  131450351  154
YourSeq  128    399     546   3000   95.1%     chr12  +       73484049   73484196   148
YourSeq  125    397     546   3000   94.4%     chr10  -       95663296   95663454   159
YourSeq  125    400     663   3000   92.6%     chr2   +       29801218   29801526   309
YourSeq  124    394     538   3000   94.3%     chrX   -       162089543  162089690  148
YourSeq  124    397     539   3000   97.1%     chr2   -       104300830  104301190  361
YourSeq  124    395     546   3000   91.3%     chr17  +       74541626   74541782   157
YourSeq  123    394     539   3000   93.4%     chr10  +       128747272  128747415  144
YourSeq  123    396     539   3000   94.3%     chr10  +       54089572   54089719   148
YourSeq  122    397     546   3000   93.0%     chr16  +       92663397   92663546   150
YourSeq  121    405     537   3000   96.9%     chr2   -       68898012   68898164   153
YourSeq  121    405     546   3000   94.2%     chr19  +       4522331    4522479    149

Note: The 3000 bp section downstream of Exon 5 is BLAT searched against the genome. No significant similarity is found.
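The top hits of the two BLAT searches give the chrX coordinates of the 3000 bp flanks, from which the interval between them can be derived. The sketch below assumes that the two query sequences directly abut the cKO region, which the report does not state explicitly; the function name is hypothetical.

```python
# Top-scoring (self) hits on chrX, 1-based inclusive coordinates
# as reported in the two BLAT tables.
upstream_flank = (10059330, 10062329)
downstream_flank = (10055703, 10058702)


def interval_between(flank_a, flank_b):
    """Return (start, end, length) of the region lying between two
    non-overlapping intervals, using 1-based inclusive coordinates."""
    low, high = sorted([flank_a, flank_b])
    start = low[1] + 1
    end = high[0] - 1
    return start, end, end - start + 1


start, end, length = interval_between(upstream_flank, downstream_flank)
print(f"chrX:{start}-{end} ({length} bp)")
```

Under that assumption the interval is 627 bp long, which matches the effective cKO region size of ~627 bp given in the strategy summary. Because Srpx lies on the reverse strand, the "upstream" flank has the higher genomic coordinates.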


Gene and information: Srpx sushi-repeat-containing protein [ Mus musculus (house mouse) ] Gene ID: 51795, updated on 24-Oct-2019

Gene summary

Official Symbol: Srpx (provided by MGI)
Official Full Name: sushi-repeat-containing protein (provided by MGI)
Primary source: MGI:MGI:1858306
See related: Ensembl:ENSMUSG00000090084
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: drs; DRS-1; DRS-2; drs-1; drs-2
Expression: Broad expression in mammary gland adult (RPKM 23.2), bladder adult (RPKM 15.6) and 16 other tissues
Orthologs: human

Genomic context

Location: X; X A1.1

Exon count: 10

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  X    NC_000086.7 (10037977..10117640, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    X    NC_000086.6 (9615103..9694787, complement)

Chromosome X - NC_000086.7


Transcript information: This gene has 4 transcripts

Gene: Srpx ENSMUSG00000090084

Description: sushi-repeat-containing protein [Source:MGI Symbol;Acc:MGI:1858306]
Gene Synonyms: drs-1, drs-2
Location: Chromosome X: 10,037,977-10,117,709 reverse strand. GRCm38:CM001013.2
About this gene: This gene has 4 transcripts (splice variants), 181 orthologues, 38 paralogues, is a member of 1 Ensembl protein family and is associated with 8 phenotypes.

Transcripts

Name      Transcript ID         bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Srpx-201  ENSMUST00000044789.9  2477  464aa       ENSMUSP00000047926.3  Protein coding  CCDS30013  Q9R0M3   TSL:1 GENCODE basic APPRIS P2
Srpx-203  ENSMUST00000115544.8  1865  444aa       ENSMUSP00000111206.2  Protein coding  -          A2BE45   TSL:1 GENCODE basic APPRIS ALT2
Srpx-202  ENSMUST00000115543.2  1218  380aa       ENSMUSP00000111205.2  Protein coding  -          Q9R0M3   TSL:1 GENCODE basic
Srpx-204  ENSMUST00000147334.1  441   No protein  -                     lncRNA          -          -        TSL:1

[Figure: Ensembl genome browser view of a 99.73 kb region of chromosome X (10.04 Mb to 10.12 Mb) showing contig BX005236.16, the reverse-strand transcripts Srpx-201, Srpx-203 and Srpx-202 (protein coding), Rpgr-202 (protein coding) and Srpx-204 (lncRNA), and the Regulatory Build track. Regulation legend: CTCF, open chromatin, promoter, promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding) and non-protein coding (RNA gene).]


Transcript: ENSMUST00000044789

[Figure: protein domain view of Srpx-201 (protein coding; reverse strand, 79.71 kb) for ENSMUSP00000047..., with a scale bar from 0 to 464 aa. Annotation tracks: Low complexity (Seg); Cleavage site (Sign...); Superfamily (Sushi/SCR/CCP superfamily); SMART (Sushi/SCR/CCP domain); Pfam (Sushi/SCR/CCP domain, Domain of unknown function DUF4174, HYR domain); PROSITE profiles (HYR domain, Sushi/SCR/CCP domain); PANTHER (Sushi repeat-containing protein SRPX, PTHR46343); Gene3D (2.10.70.10); CDD (Sushi/SCR/CCP domain); sequence variants (dbSNP and all other sources). Variant legend: missense variant, synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
