https://www.alphaknockout.com

Mouse Sec23ip Conditional Knockout Project (CRISPR/Cas9)

Objective: To create a Sec23ip conditional knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary: The Sec23ip gene (NCBI Reference Sequence: NM_001029982; Ensembl: ENSMUSG00000055319) is located on Mouse chromosome 7. 19 exons are identified, with the ATG start codon in exon 1 and the TGA stop codon in exon 18 (Transcript: ENSMUST00000042942). Exon 2 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in the loss of function of the Mouse Sec23ip gene. To engineer the targeting vector, the homologous arms and the cKO region will be generated by PCR using BAC clone RP23-121G8 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Male mice homozygous for a null allele display reduced fertility with globozoospermia and impaired fertilization.

Exon 2 starts at about 5.28% of the coding region. Knockout of exon 2 will result in a frameshift of the gene. Intron 1, used for 5'-loxP site insertion, is 4880 bp; intron 2, used for 3'-loxP site insertion, is 2120 bp. The effective cKO region is ~1027 bp and does not contain any other known gene.
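The frameshift reasoning can be illustrated with a minimal sketch. It assumes that "5.28% of the coding region" is a fraction of the CDS length, that the CDS is the 998 aa translation listed in the transcript table plus one stop codon, and that a deletion causes a frameshift when the number of removed coding nucleotides is not a multiple of 3. The coding length of exon 2 is not stated in this report, so the lengths passed to `causes_frameshift` are hypothetical.

```python
def causes_frameshift(deleted_coding_nt: int) -> bool:
    """A deletion shifts the reading frame when its coding length is not a multiple of 3."""
    return deleted_coding_nt % 3 != 0


def approx_cds_offset(fraction: float, protein_aa: int, include_stop: bool = True):
    """Convert a fractional CDS position into an approximate nucleotide offset.

    Assumption: CDS length = 3 nt per residue, plus 3 nt for the stop codon.
    Returns (approximate offset in nt, CDS length in nt).
    """
    cds_len = protein_aa * 3 + (3 if include_stop else 0)
    return round(fraction * cds_len), cds_len


# 998 aa is taken from the transcript table; 0.0528 from the text above.
offset, cds_len = approx_cds_offset(0.0528, 998)
print(f"CDS ~{cds_len} nt; exon 2 starts near CDS nucleotide {offset}")

# Hypothetical exon lengths, for illustration only.
print(causes_frameshift(100))  # not a multiple of 3
print(causes_frameshift(99))   # multiple of 3
```

Under these assumptions, exon 2 begins roughly 158 nt into the CDS, so a frameshift there would disrupt most of the protein.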


Overview of the Targeting Strategy

[Figure: targeting strategy schematic showing the wildtype allele (exons 1, 2, 3 ... 19, with the 5' and 3' gRNA regions), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Sec23ip, homology arm, cKO region, loxP site.]
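The step from the targeted allele to the constitutive KO allele can be illustrated with a toy string model. This is a minimal sketch, assuming two loxP sites in the same orientation on the same strand and the canonical 34 bp loxP sequence; the report does not state the exact site sequence used, and the flanking sequences below are hypothetical placeholders.

```python
# Canonical 34 bp loxP site (assumed; not given in this report).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Model Cre recombination between two directly repeated loxP sites.

    The sequence between the first two sites is removed and a single
    site remains. If fewer than two sites are found, the allele is
    returned unchanged.
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything up to and including the first site, then resume
    # immediately after the second site.
    return allele[: first + len(site)] + allele[second + len(site):]


# Hypothetical toy allele: 5' flank, loxP, "cKO region", loxP, 3' flank.
targeted = "AAAA" + LOXP + "CCCCGGGG" + LOXP + "TTTT"
ko = cre_excise(targeted)
print(len(targeted), len(ko))
```

The sketch only handles direct repeats; inverted loxP sites lead to inversion rather than excision and are not modeled here.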


Overview of the Dot Plot

Window size: 10 bp

[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]

Note: The sequence of the homologous arms and cKO region is aligned with itself to determine whether it contains tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
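A self-comparison of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool used to produce the plot: it reports exact matches of a fixed window (10 bp by default, as above) between the sequence and itself, on the forward strand and against the reverse complement. The function names are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dot_plot(seq: str, window: int = 10):
    """Return (forward, reverse) lists of window start pairs (i, j).

    forward: pairs i < j where the windows at i and j are identical
             (the trivial main diagonal i == j is omitted).
    reverse: pairs where the window at j equals the reverse complement
             of the window at i.
    """
    seq = seq.upper()
    n = len(seq) - window + 1
    index = defaultdict(list)
    for i in range(n):
        index[seq[i:i + window]].append(i)
    forward = sorted(
        (i, j)
        for positions in index.values()
        for k, i in enumerate(positions)
        for j in positions[k + 1:]
    )
    reverse = sorted(
        (i, j)
        for i in range(n)
        for j in index.get(revcomp(seq[i:i + window]), ())
    )
    return forward, reverse


# Toy input: a 10 bp unit repeated in tandem gives one off-diagonal forward hit.
fwd, rev = self_dot_plot("ACGTTGCAAT" * 2, window=10)
print(fwd)
```

In a real plot, runs of off-diagonal forward hits parallel to the main diagonal would indicate tandem repeats; their absence is what the note above reports.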

Overview of the GC Content Distribution

Window size: 300 bp

[Figure: GC content distribution along the sequence.]

Summary: Full length (7527 bp) | A (26.01%, 1958) | C (19.89%, 1497) | T (30.81%, 2319) | G (23.29%, 1753)

Note: The sequence of the homologous arms and cKO region is analyzed to determine its GC content. No region of significantly high GC content is found, so this region is suitable for PCR screening or sequencing analysis.
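The summary figures can be cross-checked, and the windowed analysis sketched, as follows. The base counts are taken from the summary line; the `sliding_gc` helper is a hypothetical illustration of a 300 bp sliding window and is not the tool used for the plot.

```python
# Base counts as reported in the summary line.
REPORTED_COUNTS = {"A": 1958, "C": 1497, "T": 2319, "G": 1753}


def percentages(counts):
    """Per-base percentages, rounded to two decimals."""
    total = sum(counts.values())
    return {base: round(100 * n / total, 2) for base, n in counts.items()}


def gc_percent(counts):
    """Overall GC content in percent, rounded to two decimals."""
    total = sum(counts.values())
    return round(100 * (counts["G"] + counts["C"]) / total, 2)


def sliding_gc(seq, window=300, step=1):
    """GC fraction for each window start position along seq."""
    seq = seq.upper()
    return [
        (start, sum(seq[start:start + window].count(b) for b in "GC") / window)
        for start in range(0, len(seq) - window + 1, step)
    ]


print(sum(REPORTED_COUNTS.values()))  # total length implied by the counts
print(percentages(REPORTED_COUNTS))
print(gc_percent(REPORTED_COUNTS))
```

The counts sum to the reported full length of 7527 bp, and the overall GC content works out to about 43.18%.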


BLAT Search Results (up)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr7   +       128746810  128749809    3000
YourSeq     91    614   1813   3000     93.3%  chr1   -       180962655  181116259  153605
YourSeq     88   1701   1843   3000     91.6%  chr1   +       194569193  194569336     144
YourSeq     78   1701   1811   3000     88.5%  chr1   -        87293398   87293506     109
YourSeq     78    303    711   3000     72.2%  chr3   +        94694996   94695139     144
YourSeq     78   1701   1807   3000     85.8%  chr1   +         6156110    6156210     101
YourSeq     75     88    705   3000     81.4%  chr1   +         9420320    9559156  138837
YourSeq     74   1739   1834   3000     89.8%  chr13  -        67790053   67790147      95
YourSeq     74   1735   1834   3000     88.9%  chr1   +        82331218   82331316      99
YourSeq     73   1723   1814   3000     90.3%  chr18  -        34793721   34793815      95
YourSeq     72   1735   1843   3000     92.8%  chr12  +        67779913   67780229     317
YourSeq     71   1740   1831   3000     89.2%  chr13  -       119118970  119119062      93
YourSeq     71    566    682   3000     82.6%  chr17  +        79113774   79113881     108
YourSeq     70   1700   1811   3000     90.0%  chr5   +       137724886  137724995     110
YourSeq     69    615    711   3000     79.8%  chr17  +        65631180   65631268      89
YourSeq     68   1737   1835   3000     87.3%  chr7   -        40208458   40208568     111
YourSeq     67   1734   1817   3000     94.8%  chr11  -       113736196  113736281      86
YourSeq     66   1754   1831   3000     92.4%  chr5   -        88741759   88741836      78
YourSeq     64    623    711   3000     82.4%  chrX   -        50403243   50403328      86
YourSeq     63    614    711   3000     81.6%  chr13  +        61755134   61755223      90

Note: The 3000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity to other loci is found.

BLAT Search Results (down)

QUERY    SCORE  START    END  QSIZE  IDENTITY  CHROM  STRAND      START        END    SPAN
-------------------------------------------------------------------------------------------
YourSeq   3000      1   3000   3000    100.0%  chr7   +       128750837  128753836    3000
YourSeq     88   1003   1500   3000     89.2%  chr7   -        97763389   97763891     503
YourSeq     87   1003   1532   3000     71.5%  chr6   -       125418460  125418590     131
YourSeq     78    861   1101   3000     89.6%  chr11  -        57822166   57822717     552
YourSeq     74   1003   1101   3000     85.4%  chr17  -        23589968   23590055      88
YourSeq     72   1003   1101   3000     81.6%  chr8   +        96315442   96315534      93
YourSeq     71   1003   1160   3000     93.9%  chr2   -       131226655  131227003     349
YourSeq     71   1003   1101   3000     84.6%  chr14  -        81320344   81320436      93
YourSeq     70   1003   1102   3000     84.8%  chr5   -        31960854   31960948      95
YourSeq     70   1003   1102   3000     83.2%  chr13  -        73156148   73156242      95
YourSeq     70   1003   1102   3000     82.5%  chr10  -       128185213  128185307      95
YourSeq     70   1009   1121   3000     79.2%  chr8   +        88235398   88235501     104
YourSeq     70   1003   1513   3000     70.4%  chr4   +        46441625   46441746     122
YourSeq     68   1003   1161   3000     86.2%  chr15  -        98461351   98461509     159
YourSeq     68   1003   1102   3000     81.4%  chr11  +       116044041  116044135      95
YourSeq     67   1003   1102   3000     79.8%  chr11  +        20126569   20126663      95
YourSeq     66   1003   1097   3000     83.0%  chr5   -        99244285   99244376      92
YourSeq     66   1003   1097   3000     80.9%  chr1   -       144968824  144968913      90
YourSeq     66   1003   1097   3000     80.9%  chr9   +       107067620  107067709      90
YourSeq     65   1003   1102   3000     77.5%  chrX   -       117781577  117781670      94

Note: The 3000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity to other loci is found.
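The top hit of each BLAT search places the query on chr7, and the coordinates of the two queries can be compared with the stated cKO region size. The sketch below uses the chr7 coordinates of the two 100.0% identity hits; treating the gap between the two flanks as the cKO region is an assumption, although it agrees with the ~1027 bp given in the strategy summary.

```python
# chr7 coordinates (1-based, inclusive) of the 100% identity hits.
UPSTREAM = (128746810, 128749809)
DOWNSTREAM = (128750837, 128753836)


def span(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def gap_between(left, right) -> int:
    """Number of bases strictly between two non-overlapping intervals."""
    return right[0] - left[1] - 1


print(span(*UPSTREAM), span(*DOWNSTREAM))
print(gap_between(UPSTREAM, DOWNSTREAM))
```

Both flanks span 3000 bp, and 1027 bp lie between them.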


Gene and information:

Sec23ip Sec23 interacting protein [Mus musculus (house mouse)]
Gene ID: 207352, updated on 12-Aug-2019

Gene summary

Official Symbol: Sec23ip (provided by MGI)
Official Full Name: Sec23 interacting protein (provided by MGI)
Primary source: MGI:MGI:2450915
See related: Ensembl:ENSMUSG00000055319
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: p125; D7Ertd373e
Expression: Ubiquitous expression in testis adult (RPKM 24.2), thymus adult (RPKM 11.5) and 28 other tissues
Orthologs: human, all

Genomic context

Location: 7 F3; 7 70.51 cM

Exon count: 19

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  7    NC_000073.6 (128744862..128784836)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    7    NC_000073.5 (135888384..135928350)

Chromosome 7 - NC_000073.6


Transcript information: This gene has 4 transcripts

Gene: Sec23ip ENSMUSG00000055319

Description: Sec23 interacting protein [Source:MGI Symbol;Acc:MGI:2450915]
Gene Synonyms: D7Ertd373e, p125
Location: Chromosome 7: 128,744,943-128,784,836 forward strand. GRCm38:CM001000.2
About this gene: This gene has 4 transcripts (splice variants), 199 orthologues, 2 paralogues, is a member of 1 Ensembl protein family and is associated with 5 phenotypes.

Transcripts

Name         Transcript ID         bp    Protein     Translation ID        Biotype                  CCDS       UniProt     Flags
Sec23ip-201  ENSMUST00000042942.9  4505  998aa       ENSMUSP00000035610.8  Protein coding           CCDS21900  G3X928      TSL:1, GENCODE basic, APPRIS P1
Sec23ip-204  ENSMUST00000206986.1  399   56aa        ENSMUSP00000145911.1  Protein coding           -          A0A0U1RPB3  CDS 3' incomplete, TSL:2
Sec23ip-202  ENSMUST00000205856.1  444   97aa        ENSMUSP00000145816.1  Nonsense mediated decay  -          A0A0U1RP39  CDS 5' incomplete, TSL:3
Sec23ip-203  ENSMUST00000206504.1  4604  No protein  -                     Retained intron          -          -           TSL:NA

[Figure: genome browser view of a 59.89 kb region of chromosome 7 (128.74Mb to 128.79Mb). Forward strand: Sec23ip-201 (protein coding), Sec23ip-203 (retained intron), Sec23ip-202 (nonsense mediated decay), Sec23ip-204 (protein coding). Contig: AC136741.24. Reverse strand: Mcmbp-201 and Mcmbp-202 (protein coding), Gm44672-201 (lncRNA), n-R5s158-201 (rRNA). Tracks: genes (comprehensive set) and Regulatory Build, with regulation and gene legends.]


Transcript: ENSMUST00000042942

[Figure: protein view of Sec23ip-201 (protein coding; 39.89 kb, forward strand) for translation ENSMUSP00000035..., on a scale of 0 to 998 residues. Tracks: MobiDB lite, low complexity (Seg), coiled-coils (Ncoils), and domain annotations from Superfamily, SMART, Pfam, PROSITE profiles, PANTHER and Gene3D (DDHD domain; Sterile alpha motif domain; Sterile alpha motif/pointed domain superfamily; PTHR23509; PTHR23509:SF4). Sequence variants (dbSNP and all other sources) are shown with a legend for missense, splice region and synonymous variants.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
